Language and the Law

Thank You to Our Partner & Sponsor

S.J. Quinney College of Law

Thank You to Our Sponsors

Tanner Humanities Center

College of Humanities


Language and the Law Forum Schedule 2021

Date: Friday April 16, 2021 | Time: 9:00am - 12:30pm

9:00 AM Opening Remarks - Professor Scott Jarvis and Dean Elizabeth Kronk Warner, University of Utah


A Corpus Linguistic Approach to Quantifying Surplusage in Statutes

Statutory interpretation often relies on linguistic canons of construction, or widely accepted ‘rules of thumb’ for interpreting language. One frequently invoked canon is the ‘surplusage’ canon, which, according to Cooley (1988), states that “The courts must […] lean in favor of a construction which will render every word operative, rather than one which may make some idle and nugatory”. This legal canon has a counterpart in linguistics, known as the Maxim of Quantity: “Do not make your contribution more informative than is required” (Grice, 1975). In spite of these widely cited expectations, it is also generally accepted that “legal drafters often include redundant language on purpose to cover unforeseen gaps or simply for no good reason at all” (Jellum, 2008; see also Scalia & Garner, 2012). As a result, it is not uncommon for questions of statutory interpretation to hinge on whether a phrase or provision contains language that is redundant or superfluous. Even so, there are currently no reliable methods for detecting possible surplusage in statutes. In this talk, I present a new method for determining whether there is linguistic evidence that words in binomials (e.g. care and support, liens and claims, null and void) violate the canon of surplusage (i.e. are semantically redundant), and should thus not be assigned independent meanings. The method applies linguistic techniques to a corpus (a large sample of naturally produced language) and is designed to reliably quantify the degree to which binomials are formulaic and semantically similar. It is applied to a set of binomials that have been the subject of dispute in previous legal cases. I will conclude by discussing this new method in relation to the role of linguistic data and judicial discretion in statutory interpretation.

Statutory Originalism and Language

The term "originalism" is usually applied to theories of constitutional interpretation, but the two core ideas that unify the originalist family of constitutional interpretation and construction (fixation and constraint) apply with equal force to statutes. There are, however, differences. The situation of constitutional communication differs in significant respects from the situation of statutory communication. One important difference concerns the intended readership of constitutions and statutes. Constitutions are written for the public and the meaning of the constitutional text is its "public meaning." Some statutes are aimed at the public, but others are directed at more specialized audiences. For example, regulatory statutes are usually written for regulatory agencies and the businesses that will be subject to the regulations. For this reason, the "plain meaning" of a statute may not be its public meaning.

Legal and Linguistic Approaches to Genericity: Perspectives and Protocols in Trademark Disputes

Recent decades have seen an increase in expert testimony given by linguists before the United States Patent and Trademark Office (USPTO) and in state and federal courts in the United States (Butters 2010). Previous linguistic scholarship on the topic of trademarks has mostly discussed the broad role of linguistics in the legal assessments of trademarks (e.g., Shuy 2002; Shuy 2012); specific questions within trademark disputes such as dilution (e.g., Butters 2008; Popoola n.d.); and the general benefits of certain sources of evidence such as dictionaries, surveys, and corpora in the analysis of trademark questions (e.g., Hotta and Fujita 2012; Kilgarriff 2015; Ullrich 2018). However, one of the primary questions in such cases is one of genericity: Is, or was, a term generic?

This talk outlines the legal and linguistic approaches to genericity, highlighting the multiple ways linguists approach the concept of genericness. For instance, as proper nouns, trademarks should possess certain features differentiating them from common nouns – e.g., proper nouns are not pluralized or preceded by an article and are written with initial capital letters (Kilgarriff 2015), whereas common nouns can be inflected for number and can follow an article or preposition (Crystal 2008; Finegan 2015). In order to investigate such use, linguists have relied on a variety of resources: dictionaries of various types (e.g., contemporary general English dictionaries, industry-specific dictionaries); corpora varying widely in kind, size, and structure (e.g., specialized corpora of regional American English, monitor corpora such as the Corpus of Contemporary American English); and other sources of evidence (e.g., consumer reviews, Google Trends). However, analyses of these resources have demonstrated that users of language do not always follow such categorical linguistic distinctions when referring to noun classes. Given the essential role played by decisions of genericity and descriptiveness – the two categories within the trademark paradigm that receive little to no protection from the USPTO – this paper presents two case studies that exemplify ways in which linguists have investigated the varied, sometimes conflicting, linguistic information in dictionaries, corpora, and select other resources in order to address questions about the generic status of putative protected terms.


Butters, R. R. 2008. A linguistic look at trademark dilution. Santa Clara Computer & High Tech. LJ 24: 507.

Butters, R. R. 2010. Trademarks: language that one owns. In M. Coulthard and A. Johnson (eds.) The Routledge Handbook of Forensic Linguistics. Routledge: 351–364.

Crystal, D. 2008. A Dictionary of Linguistics and Phonetics, 6th edn. Blackwell.

Finegan, E. 2015. Language: Its Structure and Use, 7th edn. Cengage.

Hotta, S., and Fujita, M. 2012. The psycholinguistic basis of distinctiveness in trademark law. In P. M. Tiersma and L. M. Solan (eds.) The Oxford Handbook of Language and Law. Oxford University Press: 478–488.

Kilgarriff, A. 2015. Corpus Linguistics in trademark cases. Dictionaries: Journal of the Dictionary Society of North America 36: 100–114.

Popoola, O. (n.d.) A dictionary, a survey and a corpus walked into a courtroom...: An evaluation of resources for adjudicating meaning in trademark disputes. Corpus, 600: 1721.

Shuy, R. W. 2002. Linguistic Battles in Trademark Disputes. Springer.

Shuy, R. W. 2012. Using linguistics in trademark cases. In P. M. Tiersma and L. M. Solan (eds.) The Oxford Handbook of Language and Law. Oxford University Press: 449–462.

Ullrich, Q. J. 2018. Corpora in the Courts: Using Textual Data to Gauge Genericness and Trademark Validity. Trademark Rep. 108: 989.

Corpus Linguistics in the Courts: Critiques, Responses, and the Path Forward

Lawyers and jurists have long sought to discern the “ordinary meaning” of the language of the law. In interpreting statutes, constitutional provisions, and contracts, our courts claim to be applying the “plain” or “ordinary” meaning of legal language. But judicial tools for discerning such meaning have long fallen short. The judge’s traditional toolbox includes dictionaries, etymology, and old-fashioned judicial intuition, each of which may have a role but none of which is adequate to the task.

In the past decade a few judges have begun to utilize additional tools, borrowed from the field of linguistics, to sharpen this inquiry. In opinions in a few state supreme courts and federal courts of appeals, judges have proposed to utilize corpus linguistic tools of collocation analysis and concordance line analysis to assemble transparent evidence of ordinary usage of legal language. Because premises of “ordinary meaning” seem to turn on actual usage of language by ordinary people, judges have suggested that the law’s assessment of ordinary meaning should be informed by statistical analysis of actual language usage in naturally occurring samples of language—in corpora like the Corpus of Contemporary American English, the News on the Web Corpus, or the Corpus of Historical American English.

This move has prompted a series of critiques and concerns. Some judges have suggested that there is a judicial ethics problem with a judge conducting his own corpus linguistic analysis without the benefit of expert witness testimony. Others have asserted that the data assembled from a corpus cannot, in any event, inform the “ordinary meaning” questions posed in the law. And some commentators have questioned the statistical or scientific relevance or salience of corpus analysis in law, suggesting the possible need for alternative approaches—the use of different corpora, or other means of empirical inquiry (such as the use of human-subject surveys).

This paper summarizes these developments and describes and responds to critiques of the corpus linguistics movement in the courts. It first explains that judges are as ethically free to use corpus linguistics tools to inquire into ordinary meaning as they are to consult various dictionaries or perform historical research into the original meaning of a provision of the Constitution. It then concedes that refinements in corpus methodology are needed to improve on the utility of these methods in the law of interpretation, but notes that these tools fare better than any other set of tools used to date by judges. In conclusion, the paper highlights misunderstandings in the empirical criticisms of the use of corpus methods, as well as shortcomings in the proposed use of human-subject surveys in this field.

12:15 PM Closing Remarks - Professor William Eggington, Brigham Young University

12:30 PM Adjourn

Last Updated: 4/16/21