What are ‘case uses’ and how do I read box plots?

I posted a new article on SSRN this week that presents the big-picture findings of an empirical study I’ve been doing of the argumentative practices of judges and advocates in 199 textual artifacts. (Page numbers here are for the version posted on SSRN on Feb. 18, 2020.) I’ve done the analysis (with the help of a dozen research assistants) over the last two years, and the findings raise some interesting questions. But before we can really talk about the findings, you need to know what it is we were looking for, the ‘case use.’ And because I presented many of the findings using box plots, you might want to understand them, too.

First, though, don’t freak out about the length of the draft! Of the seventy-eight pages (!), twenty cover research methods and many have tables of data, all probably to be relegated to appendices, of which there are already nine pages in the draft.

I was interested in doing a study to determine how judges and advocates used citations to cases in their arguments. I was not interested in doing another citation-count study, one where we ask “Did the author of this brief/opinion cite this other thing anywhere in this brief/opinion?” Those studies may have value in a variety of ways, but they don’t tell us anything about how the citing text used the cited text. (See pages 17–21.) As far as I can tell, no empirical study ever has. To do so, I developed the research construct of the ‘case use.’

A case use consists of citation to and discussion of a court opinion in a section of a legal argument to support the assertion of the author’s claim in that section of argument. We identified four substantive types of case uses that make rational argumentative appeals: supporting assertion of a rule, supporting a policy statement, supporting a generalization about prior cases, and functioning as an example of how a rule, policy, or standard is applied. A fifth type consisted of citing a case because the author quoted it. And a sixth catch-all type existed for types of use that did not fall into these categories. (page 5, emphasis added here)

For more detail regarding the coding of these categories, check out pages 34–38. (‘Coding’ here does not refer to writing computer code but rather to putting the codes or category tags described in the quote on particular case uses. There were more than 5600 such case uses in these artifacts, each coded by at least two people, requiring us to make more than 42,000 coding decisions and requiring us to discuss the nearly 15% of them where the coders disagreed. See the methods at page 34 and the online copy of the coding guide we used.)

For this study, I randomly selected fifty-five reported opinions of federal district courts between 2012 and 2018 that addressed fair use in copyright and 144 of the briefs that precipitated those same opinions. I analyzed only the fair-use sections of the arguments. This article provides a baseline of how often (expressed as relative frequency or frequency per 1000 words) the authors in the study used cases in these ways (page 43). It also compares judges to lawyers (page 48), prevailing lawyers to non-prevailing (page 52), lawyers of parties moving for relief versus those opposing motions (page 55), and authors in the copyright-rich SDNY versus authors elsewhere (page 57). Many of the differences are both statistically and practically significant, or at least so I argue.
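To make 'frequency per 1000 words' concrete, here is a minimal sketch of the arithmetic. The function name and the sample numbers are hypothetical, chosen only for illustration; they are not data from the study.

```python
def uses_per_1000_words(case_uses: int, word_count: int) -> float:
    """Normalize a raw count of case uses by the length of the argument section."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    # Multiply before dividing so round numbers stay exact in floating point.
    return case_uses * 1000 / word_count


# Hypothetical example: 12 rule case uses in a 2,400-word fair-use section.
example_rate = uses_per_1000_words(12, 2400)
print(example_rate)
```

Normalizing this way lets a long opinion and a short brief be compared on the same scale.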

I present many of the findings in the study using box plots. Here’s why: If I told you that judges used cases to support rules on average 6.22 times per 1000 words and that advocates used them 4.92 times per 1000 words (see Table 5 at page 49), and if I told you what the standard deviations are and that the differences are statistically significant (p < 0.05), you’d probably get no sense of the meaningfulness of the size of the difference, and your eyes might glaze over. A box plot allows me to show the middle (‘median’) value for each variable and then how the second and third quartiles are dispersed around it. Consequently, it’s much easier to make visual comparisons between typical ranges for each variable.

This example comes from Figure 3 (page 46), showing uses of case types across all 199 artifacts.

For each of the five categories of case uses, there are two white rectangles, each potentially with a horizontal line extending from it, and potentially some little dots even farther out in the distance. The black vertical line between the two white rectangles represents the median value for the variable. For the rule case uses here, that value is at 5.25 (see Table 4 at page 43 for details). The left white rectangle shows the range over which the second quartile of values (the 25% of the values just below the median) is dispersed, and the right white rectangle shows the same for the third quartile of values (the 25% just above the median). Together, the two white boxes represent the interquartile range (IQR) for the variable, which is between 3.26 and 6.9 for the rule case uses here. The horizontal lines (the 'whiskers'), if present, cover values beyond the boxes but within 1.5 times the IQR of them, and the little black dots represent outliers, values in the data beyond the whiskers. I could have plotted the values on a curve, which would be a bell curve in the event that the values had a Gaussian distribution (sometimes called a ‘normal’ distribution), but then the reader could not see where the quartiles end.
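For readers who like to see the arithmetic, the pieces of a box plot can be computed in a few lines. This is only a sketch under assumptions: the data are made up, the function name is hypothetical, and I am assuming the common convention in which whiskers stop at the most extreme value within 1.5 times the IQR of the box. Plotting software also differs in how it interpolates quartiles, so the numbers from a given tool may differ slightly.

```python
import statistics


def box_plot_summary(values):
    """Return the numbers a box plot draws for one variable."""
    ordered = sorted(values)
    # Three cut points that split the data into four quartiles.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences: anything farther than 1.5 * IQR from the box is an outlier.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],    # end of the left whisker
        "whisker_high": inside[-1],  # end of the right whisker
        "outliers": outliers,        # the little dots
    }


# Made-up rates per 1000 words, NOT data from the study.
rates = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]
summary = box_plot_summary(rates)
print(summary)
```

In this made-up example the box runs from 3.0 to 7.0 with the median line at 5.0, the whiskers reach 1.0 and 8.0, and the value 20.0 is drawn as an outlier dot.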

Though it is somewhat arbitrary on my part, I’m referring to values within the IQR as ‘typical’ and those outside it as ‘unusually high’ or ‘low.’ It’s important to understand that I’m not calling the IQR the ‘normal range,’ and I’m not suggesting that the briefs and opinions that are outliers are wrong. The rhetorical view is that some situations, for example, might call for briefs with a lot more examples and so might be outliers on that metric without being bad briefs. But I suspect a brief’s author should keep their case uses in the typical range unless there is a good reason to go big or go small.
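That labeling rule is simple enough to sketch in code. The function name and the three sample rates are hypothetical; the IQR bounds of 3.26 and 6.9 are the ones reported above for rule case uses. I am assuming here that a value exactly on a boundary counts as typical.

```python
def classify_rate(rate, q1, q3):
    """Label a rate relative to the interquartile range [q1, q3]."""
    if rate < q1:
        return "unusually low"
    if rate > q3:
        return "unusually high"
    return "typical"


# IQR bounds for rule case uses, as reported in the text above.
RULE_Q1, RULE_Q3 = 3.26, 6.9

# Hypothetical rates per 1000 words for three imaginary briefs.
labels = {rate: classify_rate(rate, RULE_Q1, RULE_Q3) for rate in (2.0, 5.25, 8.0)}
print(labels)
```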

At this point, you know what you need to know to make sense of the comparisons in the article that I mentioned above. Go check them out! I’d welcome your comments here or via email about what you see, why you think the differences are there, and anything else.

