Generative artificial intelligence (GenAI) offers the potential to greatly increase the automation of routine tasks that traditionally are repetitive and time-consuming for lawyers. Particularly important in this regard are large language models (LLMs), a type of GenAI that is trained to understand and process human language by learning patterns and associations from large amounts of text data. Key capabilities of LLMs include (1) the ability to generate textual responses similar to humans along with contextual awareness and (2) strong problem-solving and decision-making abilities using text-based information for tasks. (While LLMs are a subset of GenAI tools, for the rest of the article, the terms GenAI and LLMs are used interchangeably.)
From expedited document review to refined contract generation, talent acquisition, conflict resolution, and legal research, LLMs are revolutionizing how legal professionals approach their work.
GenAI applications in law practice are still maturing, however, and are not without risk.
“Hallucination” and Untrustworthy Results
In one of the best-known examples of this risk, plaintiff’s counsel in Mata v. Avianca, Inc., 22-cv-1461 (PKC) (S.D.N.Y. Jun. 22, 2023), were found to have submitted a brief that contained references to nonexistent cases. The two attorneys were sanctioned after admitting that some of the nonexistent cases and references could be attributed to ChatGPT, the LLM they had used for their research, and that they were unaware that its content could be false.
As the example above demonstrates, LLMs have been shown to “hallucinate,” responding to prompts with information that is entirely fabricated. And even when the results are factual, they might not be current; laws are constantly updated and reinterpreted by the courts. Legal professionals who employ LLMs must adopt strict protocols for cross-checking and verifying the information provided by LLMs.
Confidentiality and Data Security
Additionally, the importance of client confidentiality in legal work raises legitimate questions about the use of open-source LLM applications such as ChatGPT in the legal industry. Attorneys face the risk of disclosing confidential data when feeding information about cases, clients, and even law firm personnel into an LLM.
LLMs present several novel security threats. The Open Worldwide Application Security Project (OWASP) recently released its list of the “Top Ten” threats:
1. Prompt injection. Manipulating LLMs via crafted inputs can lead to unauthorized access, data breaches, and compromised decision-making.
2. Insecure output handling. Neglecting to validate LLM outputs may lead to downstream security exploits, including code execution that compromises systems and exposes data.
3. Training data poisoning. Tampered training data can impair LLM models, leading to responses that may compromise security, accuracy, or ethical behavior.
4. Model denial of service. Overloading LLMs with resource-heavy operations can cause service disruptions and increased costs.
5. Supply chain vulnerabilities. Depending upon compromised components, services or datasets undermine system integrity, causing data breaches and system failures.
6. Sensitive information disclosure. Failure to protect against disclosure of sensitive information in LLM outputs can result in legal consequences or a loss of competitive advantage.
7. Insecure plugin design. LLM plugins processing untrusted inputs and having insufficient access control risk severe exploits like remote code execution.
8. Excessive agency. Granting LLMs unchecked autonomy to take action can lead to unintended consequences, jeopardizing reliability, privacy, and trust.
9. Overreliance. Failing to critically assess LLM outputs can lead to compromised decision-making, security vulnerabilities, and legal liabilities.
10. Model theft. Unauthorized access to proprietary large language models risks theft, competitive advantage, and dissemination of sensitive information.
The good news is there’s no need to reinvent the wheel as far as managing the cybersecurity risks when using LLMs. Most precautionary and security measures are tried-and-tested best practice security tips: anonymization/obfuscation, encryption, access control with authentication, and authorization. They just need updating and tweaking for the AI world.