A Brief Generative AI Primer
Generative AI is a type of AI system that, at its most basic level, takes input from the user, analyzes that input, and creates an output based on its analysis. These user inputs are often referred to as prompts, and the interplay between the user and the generative AI tool is sometimes referred to as “prompt engineering.” This input/output process is not unlike how other AI tools work, such as machine learning, the technology from which technology-assisted review was developed to categorize documents in litigation and discovery. In that context, the output is a relevancy score or rank used to classify each document as relevant or not relevant.
Generative AI takes that output one step further. Instead of predicting a relevancy rank, generative AI creates a new piece of content based on the input provided by the user. To put it differently, and in the context of discovery, instead of using AI to prioritize the documents that are most relevant, a generative AI tool could be used to produce a summary of the documents identified for its analysis.
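For readers who want a more concrete picture of that distinction, the minimal sketch below contrasts the two kinds of output: a relevancy score that ranks documents (the technology-assisted review pattern) and a newly generated summary built from the same documents. The functions, sample documents, and keyword-based scoring are purely illustrative stand-ins, not any vendor’s actual method.

```python
# A toy illustration (not any real product's API) contrasting the two kinds
# of output described above: a relevancy score used to rank documents in
# technology-assisted review versus new content generated from the same input.

from typing import List

def score_relevance(document: str, issue_terms: List[str]) -> float:
    """Machine-learning-style output: a relevancy score used for ranking.
    A toy keyword overlap stands in for a trained classifier."""
    words = set(document.lower().replace(".", "").split())
    hits = sum(1 for term in issue_terms if term in words)
    return hits / max(len(issue_terms), 1)

def summarize(documents: List[str]) -> str:
    """Generative-style output: a new piece of content created from the inputs.
    A real tool would call a large language model here."""
    first_sentences = [doc.split(".")[0].strip() for doc in documents]
    return "Summary of reviewed documents: " + "; ".join(first_sentences) + "."

docs = [
    "The contract was signed on March 1. Payment terms were net 30.",
    "The shipment was delayed by two weeks. The buyer requested a refund.",
]

# Traditional machine learning / TAR: rank documents by relevancy score.
for doc in docs:
    print(round(score_relevance(doc, ["contract", "payment", "refund"]), 2))

# Generative AI: create a new piece of content from the same inputs.
print(summarize(docs))
```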
Generative AI tools create text, images, and other outputs based on prompts using large language models trained on large sets of data. For example, the datasets used to train GPT-3, the model underlying the original ChatGPT, included text from the following sources: (1) Common Crawl, containing petabytes of data collected from the internet since 2008; (2) WebText2, text from webpages with highly ranked Reddit links; (3) Books 1 and Books 2, two internet-based book repositories; and (4) Wikipedia.
ChatGPT and Google’s Bard are examples of text-based generative AI tools, while Midjourney, DALL-E, and Stable Diffusion are examples of image-based tools, in which text inputs are used to create images as the generative AI output. While this article focuses primarily on the impact of text- and image-based tools, hundreds of generative AI tools have been and are being developed for a variety of other outputs (e.g., presentations, websites, video content). These tools are expected to become more sophisticated and increasingly integrated with the tools we already use in our everyday lives.
An aspect of generative AI tools that makes them so accessible to the broader public is that they use natural language processing. No specialized script or coding expertise is required to use these tools; instead, users can engage with them in a conversational way to achieve their desired output, making them more user-friendly. However, while these tools are increasingly user-friendly, it is important for users to treat the input/output exchange as an iterative process. For example, the first output provided by the generative AI tool may not reflect the full context of the input provided by the user. Much like the iterative process used in technology-assisted review, multiple rounds of adjustment from one output to the next are often needed to maximize the quality of the final output. In some cases, where generative AI tools are being used for nuanced subject matter, a subject matter expert should be involved as a “human-in-the-loop” to validate the results.
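To make that iterative, human-in-the-loop pattern concrete, here is a minimal sketch. The `generate` and `expert_approves` functions are hypothetical placeholders: the first stands in for whatever generative AI tool is in use, and the second stands in for a subject matter expert’s review (simulated here with a simple completeness check).

```python
# A minimal sketch of the iterative, human-in-the-loop process described above.
# `generate` and `expert_approves` are hypothetical stand-ins, not a real API.

def generate(prompt: str) -> str:
    """Placeholder for a call to a generative AI tool."""
    return f"[draft responding to: {prompt}]"

def expert_approves(output: str) -> bool:
    """Placeholder for a subject matter expert's validation of the output."""
    required_topics = ["payment terms", "termination", "indemnification"]
    return all(topic in output for topic in required_topics)

prompt = "Summarize the key obligations in the attached services agreement."
output = generate(prompt)
for _ in range(3):  # cap the number of refinement rounds
    if expert_approves(output):
        break
    # The expert's feedback is folded back into the prompt and the tool is rerun.
    prompt += " Address payment terms, termination, and indemnification."
    output = generate(prompt)

print(output)
```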
We have all engaged with generative AI in some form for many years. Think of any application that automatically predicts the next words you’ll type, such as your email application or a browser’s search bar. Those are early examples of generative AI, in which the system predicts your next word. Now, these AI systems are “predicting” far more robust content. The accessibility and scalability of generative AI tools, and the wide variety of use cases for them across all business types, highlight how these technologies may truly transform the way we prepare content and conduct business.
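The “predict the next word” idea can be illustrated with a toy example. The sketch below builds a predictor from simple bigram counts over a short sample sentence; production systems use vastly larger models and context windows, but the basic prediction task is the same.

```python
# A toy next-word predictor built from bigram counts. Real autocomplete and
# large language models are far more sophisticated, but the underlying idea,
# predicting the most likely next word, is the same.

from collections import Counter, defaultdict

sample_text = "please find attached the draft please find attached the agreement"
words = sample_text.split()

# Count which word tends to follow each word in the sample.
following = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1

def predict_next(word: str) -> str:
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else ""

print(predict_next("find"))      # -> attached
print(predict_next("attached"))  # -> the
```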
Generative AI’s Limitations
With great power comes great responsibility, and nowhere is that more true than with generative AI. The ease with which users can access some of these tools, such as free online access to OpenAI’s basic-level account option, creates a risk that users will not fully appreciate their limitations. Currently, when a user receives an output from ChatGPT, there is no flag, rank, or other indication of how accurate that output might be. The onus is on the user to validate those results.
Much attention has been given to the tendency of GPT systems to “hallucinate,” that is, to produce results that are inaccurate, irrelevant, or fabricated outright. There are many reasons why a generative AI tool might hallucinate: the system might not be sufficiently trained on the subject matter, it may not fully understand the context of the input, or there may be issues with the training data. As previously mentioned, depending on the intended use of the generative AI tool and the specialized area of knowledge, it may be necessary for a subject matter expert to validate the results.
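One way to operationalize that validation step is to confirm that every citation in a generated draft actually exists before relying on it. The sketch below is a simplified illustration: the citation pattern is deliberately rough, and the `verified_citations` set stands in for a lookup against an authoritative legal research database.

```python
# A hedged sketch of one validation step: checking that citations in a
# generated draft actually exist before relying on them. The regular
# expression is intentionally rough, and `verified_citations` is a
# hypothetical stand-in for an authoritative legal research lookup.

import re

def extract_citations(text: str) -> list:
    """Pull out strings that look like federal case citations (rough pattern)."""
    return re.findall(r"\d+ U\.S\. \d+|\d+ F\.\dd \d+", text)

# Hypothetical set of citations already confirmed in a trusted database.
verified_citations = {"123 U.S. 456"}

draft = "As held in 123 U.S. 456 and in 987 F.3d 654, the claim fails."
for citation in extract_citations(draft):
    status = "verified" if citation in verified_citations else "NOT FOUND; validate before filing"
    print(citation, "->", status)
```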
Concerns about hallucinations are valid for all users but are especially vital for legal professionals, given our varied and nuanced subject matters, often with their own specialized lexicons. There have already been examples of attorneys using generative AI without fully appreciating the limitations of such tools, filing briefs written by ChatGPT that contained hallucinated case citations never validated prior to filing. The poster child is the now-infamous Mata v. Avianca, Inc., which ultimately resulted in sanctions for the attorney in question. Steven Schwartz, the attorney who used ChatGPT to write the brief, said in an affidavit to the court,
I was under the erroneous impression that ChatGPT was a type of search engine, not a piece of technology designed to converse with its users. I simply had no idea that ChatGPT was capable of fabricating entire case citations or judicial opinions, especially in a manner that appeared authentic.
Schwartz’s failure to understand the limitations of a tool like ChatGPT, and the resulting sanctions, highlight the importance of the legal profession taking the time to educate itself on generative AI tools.
Issues surrounding confidentiality and privilege are also vitally important for legal professionals to consider. Not only are these generative AI systems trained on the data underlying their large language models, but they may also continue to learn, or train, on the inputs provided by users. For example, whether OpenAI’s ChatGPT trains on a user’s inputs depends on the type of license the user has for their account (e.g., Free, Plus, or Enterprise). Prior to using any generative AI tool for privileged and/or confidential work product, attorneys must have a full understanding of the terms and conditions surrounding the tools they are using.
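As one illustration of how a practice might reduce the risk of exposing confidential information, the sketch below redacts obvious identifiers from text before it is submitted to an external tool. The patterns and the client name are hypothetical, and redaction is no substitute for understanding the provider’s terms and data-handling practices.

```python
# A hedged sketch of redacting obvious identifiers before text leaves the firm.
# The patterns below are illustrative only; they are not a substitute for
# reviewing the tool's terms of service and data-handling practices.

import re

REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),               # Social Security numbers
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),           # email addresses
    (re.compile(r"\bAcme Holdings\b", re.IGNORECASE), "[CLIENT]"), # hypothetical client name
]

def redact(text: str) -> str:
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

prompt = ("Summarize the dispute between Acme Holdings and its vendor "
          "(contact: cfo@acme.example, SSN 123-45-6789).")
print(redact(prompt))
```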
Emerging Technologies, Emerging Issues
Focus on AI Ethics
The expanded availability and use of increasingly powerful AI tools have made the topic of AI ethics more commonplace. A baseline definition of AI ethics can be summarized as “a broad collection of considerations for responsible AI that combines safety, security, human concerns, and environmental considerations.” Not only are governing bodies initiating proposals for AI acts to regulate AI systems based on varying degrees of risk, but the White House has also published a “Blueprint for an AI Bill of Rights” identifying key principles to guide the use of AI systems. Those key principles are (1) safe and effective systems, (2) algorithmic discrimination protections, (3) data privacy, (4) notice and explanation, and (5) human alternatives, consideration, and fallback.
Many of these principles align with existing model rules of professional conduct, and the judiciary and attorneys would do well to become familiar with them as they begin to integrate generative AI tools into their practice. That familiarity will be necessary not only for their own practice; as discussions of AI ethics expand and more regulations govern the use of AI and generative AI systems, judges and attorneys will need a baseline of knowledge to advise clients on their use of these tools and to oversee matters that relate to their use.
Availability of Generative AI Tools
Less than a year after OpenAI’s November 2022 announcement of ChatGPT, other software providers began making their own announcements about generative AI tools in their tech stacks. Microsoft announced in March 2023 that it would start integrating generative AI into its existing applications under the product name Microsoft Copilot. Conceptually, think of it as ChatGPT embedded into your Microsoft environment, including Outlook, Word, Excel, and PowerPoint. Copilot could be used, for example, to create a Word document summarizing a select set of documents or emails from your inbox.
The focus on generative AI is not limited to broader software applications. Existing and emerging legal tech companies are announcing generative AI integrations with regularity. From e-discovery to contract management to legal brief drafting, legal-specific tools are becoming increasingly available to suit the nuanced needs of legal professionals. Generative AI–created content will become more commonplace as these tools become more readily available and integrated into existing enterprise systems.
Anticipated Impact on the Judiciary
While Mata v. Avianca, Inc. has served as an initial example of how the judiciary may have to address issues related to generative AI, it is far from the last. Shortly after the use of ChatGPT came to light in the Mata matter, some judges decided to take a proactive approach to the potential use of generative AI. Judge Brantley Starr, a federal district judge in the Northern District of Texas, was the first to announce an update to his individual chambers rules requiring parties to file a certification regarding the use (or nonuse) of generative AI tools. If a generative AI tool is used to assist with any portion of a filing, the certification must affirmatively state that a human being validated the output as accurate.
In addition to the use of generative AI in the practice of law, judges will likely have to address how generative AI outputs can be authenticated as evidence. This includes not only authorized use of generative AI in the normal course of business but also the potential for evidence created with deepfake technology. Digital forensic experts will likely play an increasingly common role in litigation as generative AI technology becomes more widely available. Other areas of litigation, including electronically stored information protocols and discussions around discoverability, will also likely expand to address issues related to generative AI. As the judiciary begins to experience the impact of generative AI in its courtrooms, it can look to resources, such as court-appointed neutrals, to help navigate these emerging issues.
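Digital forensic practice already offers simple, verifiable techniques that may become more prominent as authenticity disputes grow, such as cryptographic hashing to show that a file produced in discovery has not been altered. The sketch below computes a SHA-256 hash that parties can compare; the file name is hypothetical, and a matching hash says nothing about whether the underlying content is genuine or AI-generated.

```python
# A minimal sketch of hashing a file so parties can confirm that the copy
# exchanged in discovery is bit-for-bit identical to the original. Hashing
# shows a file has not been altered; it cannot show whether the underlying
# content was created or manipulated with generative AI.

import hashlib

def sha256_of_file(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# "evidence.pdf" is a hypothetical file name; substitute any produced file.
# print(sha256_of_file("evidence.pdf"))
```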
Generative AI’s Impact on Legal Ethics
ABA Model Rules
As the legal profession considers how generative AI tools will affect the varying areas of law, so too must attorneys consider how various legal ethics rules apply when they use these tools in their own practice. Looking at the ABA Model Rules of Professional Conduct, attorneys should note the following rules related to competency, confidentiality, communications, and responsibilities.
- Competency—ABA Model Rule 1.1 states that “[a] lawyer shall provide competent representation to a client. Competent representation requires the legal knowledge, skill, thoroughness, and preparation reasonably necessary for the representation.” Comment [8] of the same rule requires lawyers to “keep abreast of changes in the law and its practice, including the benefits and risks associated with relevant technology.” Now, with the increasing use of generative AI tools by our clients and in legal practice, it is important for attorneys to prioritize education and training on the use of these tools. All attorneys should make an effort to become informed as to what these tools are, what they can do, their limitations, and the ethical considerations around their use.
- Confidentiality—ABA Model Rule 1.6 states that “[a] lawyer shall not reveal information relating to the representation of a client unless the client gives informed consent” and Comment [4] further states that “[t]his prohibition also applies to disclosures by a lawyer that do not in themselves reveal protected information but could reasonably lead to the discovery of such information by a third person.” Prior to using any generative AI tool, lawyers should properly assess the terms and conditions associated with each tool and determine how data associated with the use of that tool are being used and what sort of access the software provider has to the attorneys’ work product (e.g., inputs/outputs).
- Communications—ABA Model Rule 1.4(a) requires lawyers to “promptly inform the client of any decision or circumstance with respect to which the client’s informed consent . . . is required” and ABA Model Rule 4.1(a) states that “[i]n the course of representing a client a lawyer shall not knowingly make a false statement of material fact or law to a third person.” Transparency is a key principle of many AI ethics proposals being drafted regarding the broader use of AI tools. For lawyers, transparency with clients regarding the use of generative AI tools will be equally important and should be aligned with the agreed-upon terms of engagement between lawyer and client.
- Responsibilities—ABA Model Rule 5.1 requires partners or lawyers with comparable managerial authority in a law firm to “make reasonable efforts to ensure that the firm has in effect measures giving reasonable assurance that all lawyers in the firm conform to the Rules of Professional Conduct.” ABA Model Rule 5.3 extends that obligation to the management of nonlawyers associated with a law firm, requiring partners or lawyers with comparable managerial authority to “make reasonable efforts to ensure that the firm has in effect measures giving reasonable assurance that the person’s conduct is compatible with the professional obligations of the lawyer.” As stated above in relation to competency, leadership in law firms, legal departments, and chambers should ensure proper training and education are conducted prior to any use of generative AI tools. Clear communication regarding approved tools, the development of standard operating procedures for their use, and policies governing the use of generative AI tools are all areas where partners and lawyers with managerial oversight should provide leadership and guidance.
Law.MIT.Edu Generative AI Task Force
The MIT Computational Law Report (law.MIT.edu) is an online publication that publishes peer-reviewed articles on topics at the intersection of law and computation. The law.MIT.edu publication formed a Task Force on Responsible Use of Generative AI for Law, largely inspired by the Mata v. Avianca, Inc. matter and the attorney’s reliance on ChatGPT without a full understanding of how to properly use the tool. The primary goal of the Task Force is to establish principles for best practices when using generative AI tools. These principles look to existing frameworks, including the ABA Model Rules of Professional Conduct and the United Nations Principles for the Role of Lawyers, and expand on them to more directly address issues surrounding the use of generative AI tools.
The law.MIT.edu Task Force’s draft guidelines (Version 0.2, June 2, 2023) include the following principles:
- Duty of Confidentiality to the client in all usage of AI applications;
- Duty of Fiduciary Care to the client in all usage of AI applications;
- Duty of Client Notice and Consent to the client in all usage of AI applications;
- Duty of Competence in the usage and understanding of AI applications;
- Duty of Fiduciary Loyalty to the client in all usage of AI applications;
- Duty of Regulatory Compliance and respect for the rights of third parties, applicable to the usage of AI applications in your jurisdiction(s);
- Duty of Accountability and Supervision to maintain human oversight over all usage and outputs of AI applications.
To further expand on these principles, the law.MIT.edu Task Force provides the following examples of how these principles may be applied: