February 05, 2019

Artificial Intelligence Invades Appellate Practice: The Here, The Near, and The Oh My Dear

By Richard C. Kraus

What is artificial intelligence and what do lawyers and judges need to know about it?  The legal profession has already started to adopt new AI-based systems for uses such as legal research, document review, contract analysis, and outcome prediction.  Those are the “here” and “near” of AI in law.  As with many new technologies, AI has been overhyped, creating false expectations and fears that legal skills, human judgment, experience, and interpersonal skills will someday become outdated.  Speculation about the “oh my dear” uses of AI in legal and judicial practice remains an interesting topic for technology-conscious lawyers.  But the more fantastic ideas such as using AI to objectively decide cases by analyzing facts and applying law—satirized in a Daily Show skit about a trial with the Honorable Amazon Alexa, presiding—are still figments of creative imaginations.

This panel presented an interesting discussion focused on AI-based systems that have already been implemented by lawyers and judges, using two existing legal research systems to demonstrate the potential benefits in appellate practice. 

Legal research has changed fundamentally over the past twenty years, both in the processes and techniques used to find and analyze law, and in the substantive content available to researchers.  The shift from reading case reporters, annotated statutes, and topic digests to online searching with Westlaw and Lexis was transformational for courts and practitioners. Research was no longer limited to resources available in a firm’s library or at nearby law schools and public law libraries.  Depending on the subscription plan, researchers have access to a vast body of federal and state statutes and cases, general and specialized treatises, law reviews, and more.  Rather than perusing digests and reading cases, researchers craft lengthy Boolean searches.

The next step in legal research was explored as Pablo Arredondo, co-founder of Casetext, and Thomas Hamilton, V.P. of Strategy and Operations at ROSS Intelligence, demonstrated two competing artificial-intelligence legal research systems.  Raina Haque, Professor of Practice of Technology at Wake Forest University School of Law, provided insight into the benefits and other uses of AI in legal practice and warned about the subtle influence of algorithmic biases and errors.  Hon. Colleen O’Toole, from the Ohio Court of Appeals, moderated the panel, offering the perspective of both a former private attorney who started an advanced on-demand language interpretation services company and a current judge who depends on thorough and reliable legal research.

Judge O’Toole introduced the session by asking the attendees to identify themselves as digital natives (individuals who grew up with technology), digital immigrants (individuals who learned and implemented technology as adults), or digital tourists (individuals who use technology only for tasks such as email and browsing).  The audience was mostly natives and immigrants, with only a few tourists.

Pablo Arredondo began the discussion by providing a history of artificial intelligence, defining the term as technology able to perform tasks traditionally performed by humans.  AI systems rely on “machine learning” and “natural language processing.”  Machine learning describes a system that learns on its own from processing data, improving its performance of specific tasks without being explicitly pre-programmed.  Natural language processing describes a system that mimics an understanding of human language and tries to interpret meaning and intent.  At the current stage of AI’s development, the fundamental obstacle is the inability to effectively understand human language.  AI is best at recognizing patterns and implementing logical functions, as demonstrated by chess systems that easily defeat human grandmasters.  AI can also effectively carry out facial recognition.

AI systems, in Arredondo’s view, are pitiful at understanding language and recognizing the meaning and nuance of words.  Efforts to use AI in everyday life, such as Siri and Alexa, can be helpful but are often frustrating.  As a simple example, Arredondo indicated that a system cannot always determine if the word “apple” refers to a fruit or a computer.  AI can try to gain an understanding by examining the context of a word’s use and looking for relationships, but is not always successful.  Despite these existing limitations, an AI-based legal research system can be more effective and efficient than the brute-force technique of searching for a statistical distribution of words in proximity, filtered by topic and jurisdiction, through Boolean searches. 
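Arredondo’s “apple” example can be made concrete with a toy sketch.  The snippet below is purely illustrative and is not how Casetext or any commercial system actually works: it guesses a word’s sense by counting overlaps between the surrounding words and a hand-built list of context clues, the same intuition behind “examining the context of a word’s use.”

```python
# Toy word-sense disambiguation: pick the sense of "apple" whose
# hand-built clue words overlap most with the surrounding sentence.
# The senses and clue lists here are invented for illustration.
CONTEXT_CLUES = {
    "fruit": {"pie", "orchard", "tree", "eat", "juice"},
    "computer": {"laptop", "macbook", "software", "iphone", "keyboard"},
}

def guess_sense(sentence: str) -> str:
    """Return the sense whose clue words overlap most with the sentence."""
    words = set(sentence.lower().split())
    scores = {sense: len(words & clues) for sense, clues in CONTEXT_CLUES.items()}
    return max(scores, key=scores.get)

print(guess_sense("She baked an apple pie from the orchard"))     # fruit
print(guess_sense("The apple laptop shipped with new software"))  # computer
```

Even this crude overlap count disambiguates the two sentences, but it also shows why the approach is brittle: a sentence with no clue words, or with sarcasm or metaphor, defeats it, which is precisely the limitation Arredondo described.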

Arredondo demonstrated the Casetext research system by uploading a brief filed in litigation relating to the status of Uber drivers as employees or independent contractors.  After applying algorithmic analysis to the brief, Casetext can find relevant cases that are not cited in the brief, both supporting and opposing an argument, by recognizing relationships to legal concepts and principles.  The difficulty with legal research systems such as Westlaw and Lexis is that the software does not “understand” law.  Both systems identify precedent by looking for the most cited cases and frequently used words, which Arredondo described as the “wisdom of the crowd.”  Effective search results depend on properly constructed Boolean searches.  But it is hard to know whether cases are missed.  Search results tend to be overinclusive or underinclusive.  Overinclusiveness is a problem because researchers do not have time (or clients willing to pay) to review hundreds of cases.  The risks of missing key law with underinclusive search results are obvious.

A research query is entered into Casetext in plain language rather than a complex search string.  The system uses machine learning and natural language processing to understand the intent of the question, identify legal authorities relevant to the question, and provide answers in context.  AI systems benefit from the way attorneys and judges create a corpus of law.  Attorneys include string cites to support a proposition.  Judicial decisions discuss related cases together, pointing out similarities and distinctions.  Along with case law, statutes, and secondary authority, Casetext has 2.8 million briefs in its system.  Relationships can be identified by locating common phrases such as “well established” or “well settled,” which are followed by the legal principles and cited cases.  Casetext also finds relationships from parenthetical citations that summarize cases.  Algorithms are applied to briefs, decisions, and other resources to identify relationships between cases and legal principles, which plays into AI’s strength in identifying and correlating relationships between data points.

Arredondo explained how this differs from case citation systems such as Shepard’s that do not identify context.  For example, Lexis and Westlaw often disagree about the citation flags for a specific case.  If a case is flagged as overruled or questioned, attorneys will commonly not look at the case even though it might offer insight or help for their position.  AI will alert attorneys to the relationship of legal principles in those cases with the question being researched. Arredondo noted that AI can exploit these tendencies for legal informatics, but cautioned that it cannot replace lawyers because of the obstacle of understanding language.  Language derives meaning from more than its definition.  Even the most sophisticated AI analyses cannot recognize nuance in usage, such as sarcasm in a dissent. 

Judge O’Toole talked about the risk of programming bias because AI is bad at understanding the “silos” in which cases can be placed.  Like Boolean searching, AI may not search across unrelated genres.  She remarked that AI is like fire: it is a good tool but should not be left unattended. 

Judge O’Toole asked about “outcome dangers,” and in particular, why Casetext is not as good as a law clerk, leaving the court and staff free to engage in judicial functions.  Arredondo referred to Noam Chomsky’s comment about AI’s still-primitive understanding of language.  Although Siri and Google are commonly cited as prime examples of AI’s shortcomings, both Judge O’Toole and Arredondo were optimistic that the difficulties in understanding language will be overcome.  The potential financial returns from solving language are unfathomable, providing an enormous incentive to overcome that obstacle.

Thomas Hamilton demonstrated the competing ROSS Intelligence legal research system.  He described AI as software that does something humans can do.  He drew comparisons about AI’s handling of language from his law school experience at McGill University in Montreal, where attorneys and judges are accustomed to the bilingual interpretation of English common law and French civil law.  Legal principles and reasoning written in French will be read and interpreted by English speakers and vice versa.

With the ROSS system, a lawyer can ask legal questions in general terms.  Hamilton’s example involved the fraud-on-the-market doctrine and its effect on securities fraud litigation.  The ROSS system does not use a brute-force approach based on keyword matching like Lexis or Westlaw.  It is a machine learning system that gains a deeper understanding from experience.  The system shows relevant highlighted sections from cases as well as incidental discussion of the principles and questions.  Reading the incidental discussion prompts researchers to identify related topics and develop new ideas for different approaches and arguments.  Hamilton cited studies indicating that AI systems can save 50% to 75% of the time needed for research.  He used another sample inquiry to show how the system displays prompts designed to identify relevance and finds cases with matching facts and procedural contexts.

Hamilton said that the system of developing headnotes for cases leads to human error in summarizing decisions and can also reflect the writer’s generalized bias.  In contrast, an AI system can write its own summaries tailored to the specific legal question being researched.  Targeted overviews summarizing decisions can indicate whether the court adopted and applied a relevant legal doctrine or merely used key language.  This can avoid the practice of lazy researchers citing cases based on terms highlighted by Lexis or Westlaw searches without reading the entire decision to understand the legal or procedural context.  Hamilton and other panelists, however, warned that the risk of “coasting” by researchers exists with both traditional and AI-based systems.

Professor Haque described herself as a digital native with experience in creating neural networks.  She discussed the significant risk of overreliance on AI systems by comparison to her experience growing up after word processors became common.  She is a very poor speller because she has always relied on spellcheck systems.

Haque views legal research as a limited use of AI.  Because AI operates by analyzing relationships (“latent semantic indexing” is the technical term), systems can categorize information together for recall, resulting in efficiency, accuracy, and precision.  However, AI does not capture nuance in language.
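Haque’s reference to latent semantic indexing can be illustrated with a small, self-contained sketch.  The example below is a toy, with invented terms and counts for four hypothetical “cases”: it builds a tiny term-document matrix, applies a truncated singular value decomposition (the core of latent semantic indexing), and shows that documents sharing vocabulary cluster together in the reduced “concept” space.

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = four invented "cases".
# Cases 1-2 are employment disputes; cases 3-4 are patent disputes.
terms = ["employee", "contractor", "wages", "patent", "infringement"]
X = np.array([
    [2, 1, 0, 0],   # employee
    [1, 2, 0, 0],   # contractor
    [1, 1, 0, 0],   # wages
    [0, 0, 2, 1],   # patent
    [0, 0, 1, 2],   # infringement
], dtype=float)

# Latent semantic indexing: a truncated SVD projects each document into a
# low-rank "concept" space where related documents lie along the same axes.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one row per case, k concept dims

def similarity(i: int, j: int) -> float:
    """Cosine similarity between two cases in concept space."""
    a, b = doc_vecs[i], doc_vecs[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(similarity(0, 1))  # high: both employment cases
print(similarity(0, 2))  # near zero: employment vs. patent
```

The employment cases end up nearly identical in concept space while the employment and patent cases are nearly orthogonal, which is the “categorize information together for recall” behavior Haque described; nothing in the math, however, captures nuance beyond word co-occurrence.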

In appellate practice, AI can also conduct factual research by identifying the determinative facts in cases.  In other legal settings, AI has been used in arbitration to report that cases dependent on similar facts in related legal and procedural contexts tend to settle within a certain range.  AI systems have also been used for court transcription and for drafting orders that are mostly boilerplate.  Haque described AI as a moving target.  Machine learning is on the bleeding edge of technology and allows an AI system to learn from its own experience.

Haque explained a major pitfall of AI: the algorithms employed by a system are shaped by the data used to create them.  If the data is not well maintained (“dirty data hygiene”), the system can misread or misapply it when executing algorithms.  Another concern is that AI systems can mirror what the coder is already thinking.  Humans decide which data is used to create algorithms, and that training data can influence the results; the salient data can incorporate cultural or human biases.  Language cannot be perfectly modeled mathematically, and algorithmic bias will amplify any cognitive bias or deficits programmed into the system.
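A minimal toy sketch of the training-data problem Haque described (the outcomes below are invented, and no real system is this simplistic): a naive predictor trained on a skewed sample of past rulings simply reproduces, and thereby amplifies, the skew in its data.

```python
from collections import Counter

# Hypothetical training data: 80% of the sampled past rulings favor one
# outcome purely because of how the cases were selected, not the law.
training_outcomes = ["affirm"] * 8 + ["reverse"] * 2

def predict(history: list[str]) -> str:
    """Predict the majority outcome seen in training for every new case."""
    return Counter(history).most_common(1)[0][0]

# The "model" now predicts "affirm" for every future case, echoing the
# selection bias baked into its training data.
print(predict(training_outcomes))
```

The point is not the trivial model but the pipeline: whoever chose the eight-to-two sample determined every prediction the system will ever make, which is why Haque insists the training data itself be open to scrutiny.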

Haque discussed ways to mitigate risks of AI in appellate practice.  These include best practices in software engineering.  Computer code should include comments explaining the process selected by the coder.  The code should be open and available to users.  The data used to create the algorithms, along with the cases used to test them, should also be available.  She proposed requiring systems that employ advanced algorithms to classify the input and give a plain English explanation of how results are achieved.  If a user cannot scrutinize these aspects of an AI system, it should not be used.  This is especially true when the systems are being used by judicial and public bodies.

Questions from the audience explored other potential uses of AI systems.  The panelists agreed that AI would not be effective for determining the sufficiency of pleadings by comparing the complaint to other cases.  Although AI can identify and compare allegations used to state particular causes of action, the outcomes cannot be meaningfully analyzed.  The decisional approach and judgment used by judges are too subjective and evaluative.  The panel also agreed that AI systems are not good at examining docket sheets to identify cases with certain subjects or procedural settings.  A panelist described an attempt to use AI to parse docket data to identify larceny cases involving search and seizure motions that resulted in a guilty plea or verdict.  The results were very poor.

The panel agreed that AI can identify a judge’s decisional tendencies, such as plaintiff versus defendant leanings or sentencing patterns.  The concern, however, is that patterns can be identified but AI cannot explain the background or reasoning for decisions, an interesting point related to using AI to predict outcomes in cases where a party wants to obtain litigation financing. 

The explanation of AI-based legal research in appellate practice was a helpful introduction and provided an interesting glimpse into technology that has the potential for changing the legal profession in this and many other areas.  


Richard C. Kraus

Richard C. Kraus is a shareholder with Foster, Swift, Collins & Smith, P.C. in Lansing, Michigan.  His practice over the past 40 years includes extensive experience with appeals in state and federal courts. He was named as 2006 Lawyer of the Year by Michigan Lawyer’s Weekly for his appellate representation of the University of Michigan and received the Distinguished Brief Award for Exceptional Appellate Advocacy before the Michigan Supreme Court.  He is a member of the Council of Appellate Lawyers’ executive committee and the State Bar of Michigan Appellate Practice Section Council.