Summary
- In deciding a real case, a federal judge found AI more helpful than some traditional tools.
- Despite its risks, AI holds promise as a tool for judicial decision-making.
Judge Kevin Newsom isn’t exactly an umpire, but he’s close to it. He’s a federal judge on the United States Court of Appeals for the 11th Circuit.
Because of his position, the legal community took notice of his 29-page concurring opinion in Snell v. United Specialty Insurance Co.—an opinion that can be read as a defense of the use of artificial intelligence (AI) by attorneys and even by the courts.
It’s easy—and becoming easier—to find court opinions attacking the use of AI by attorneys in their brief writing. A close reading of those opinions, however, reveals that it’s usually the misuse rather than the use of AI that’s the problem. Typically, the attorney has cited a nonexistent decision. Challenged by the court or opposing lawyer to explain the mystery citation, the attorney points the finger of blame at ChatGPT or some other AI-powered large language model (LLM).
But in each of those cases, the real fault lies with the attorney, who failed not only to read the cited decision but even to confirm that it exists.
Judge Newsom took the opportunity of a concurring opinion to explore whether ChatGPT and other AI-powered LLMs might be helpful in deciding a case. The answer was yes.
An Alabama family hired James Snell’s landscaping company for a backyard project that included the installation of a ground-level trampoline. Matthew Burton later sued the family after his daughter was injured when her head struck the trampoline pit’s retaining wall. He then added Snell and his company to the complaint, alleging negligent installation.
Snell’s liability insurer denied coverage based on two policy provisions. First, the only activity listed under the policy’s “specified operations” was “landscaping.” Second, in his application (which Alabama law makes part of the policy), Snell checked “No” to the question, “Do you do any recreational or playground equipment construction or erection?”
When the district court sided with the insurer, Snell appealed to the 11th Circuit. In one respect, the Snell case was ideal for an LLM experiment: it turned in part on the meaning of an everyday term, “landscaping.” The experiment also carried no risk, because Snell’s answering “No” to the question about recreational and playground equipment was sufficient to ensure a win for the insurer.
Declaring himself a “plain-language guy,” Newsom recounted his analytical process. He first consulted dictionaries but found their definitions unsatisfying. Must improvements be natural to qualify as landscaping? Must they be done for aesthetic reasons? And you can’t ask a dictionary whether landscaping includes trampoline installation.
He then turned to LLMs and found them helpful—not foolproof, but helpful. Why? He cited five reasons: They’re based on ordinary-language inputs, which is appealing to a plain-language guy. They can “understand” context, for example, distinguishing between a bat used in baseball and a bat that flies at night. They’re available to anyone with internet access. Their research is relatively transparent. And they’re more practical than other empirical methods, such as surveys.
Does the judge see drawbacks to relying on LLMs? Of course. One is the tendency of LLMs to “hallucinate,” making up authorities where none exist. Another is that LLMs underrepresent the language of people with little or no internet access.
On the whole, though, the judge found LLMs helpful, concluding,
… [T]his is my bottom line--I think that LLMs have promise. At the very least, it no longer strikes me as ridiculous to think that an LLM like ChatGPT might have something useful to say about the common, everyday meaning of the words and phrases used in legal texts.