

What Is the Future of Legal Artificial Intelligence?

Michael Andrew Iseri


  • This article provides an overview of AI to better equip readers with the ability to understand and characterize the different AI programs out in the real world.
  • Three AI components—human interfaces, intelligent automation, and machine learning—create different levels of AI complexities in the real world. 
  • Different types of biases could “corrupt” an AI program during its development and implementation stages and could impact development of a legal AI.

The legal field—like other professions—is undergoing a transformative phase, integrating more advanced technology into its legal services. Technology adoption accelerated in 2020 due to COVID-19 restrictions on in-person meetings and legal hearings. As with inadmissible evidence at trial, the door is now wide open to bring in technology.

This article serves as a primer on the current state of artificial intelligences (AIs) and their application to the legal field. To note, there are few resources that clearly define legal technologies—especially legal AIs—without misleading marketing terms, grandiose claims or gimmicks, or incompatible real-world applications. Most importantly, there are different classes and case scenarios of AIs in the real world, such as search engine AIs, content generation AIs, navigation AIs (such as self-driving cars), auto-response AIs, and many more. The knowledge in this article is based on the author’s perspective as an attorney and software engineer. (The author has personally developed numerous legal AIs from scratch in more than 45 fully voiced languages that dynamically complete legal services, as well as fully voiced homeless/COVID-19 resource map systems for California, fully voiced Constitution and Miranda rights programs, and legal guides on the Google Play Store. The author also built automated, fully voiced bar exam flash card study programs, but they were decommissioned when the California Supreme Court appointed the author to the California Committee of Bar Examiners to create the July/February California bar exams for a four-year term.) The information has also been vetted through numerous dialogues with various software engineers from Google and Uber in San Francisco and Silicon Valley. In the technology and programming fields, the best sources of information are local meetups because technology moves extremely fast—at times faster than articles can be written and published. It was easier to share and disseminate new technology concepts and best practices at pre-COVID meetups.

First, this article provides an overview of the three components of an AI to better equip readers with the ability to understand and characterize the different AI programs out in the real world. Second, there is a brief discussion on how the three AI components create the different levels of AI complexities in the real world. Last, this article provides an overview of the different types of biases that could “corrupt” an AI program during its development and implementation stages and that would likely impact any development of a legal AI.

The Three AI Components: Human Interfaces, Intelligent Automation, and Machine Learning

In the programming world, there is no such thing as a “true AI”—a program that codes itself to evolve and adapt. Hollywood and film depictions have tainted the populace’s perceptions of what AIs truly are across numerous professions.

There are three main components of AIs that can characterize an AI program in different professions:

Human Interface

This component is the main means by which an AI program communicates with humans, whether through sight, sound, touch (haptic feedback), or other means. Without this component, a program would not be able to receive input from humans or communicate back to them about its operations. Main examples are dialogue boxes and webpages, chatbots, voice interfaces such as Amazon’s Alexa and Apple’s Siri, vibrations, and sirens and alarms.

Intelligent Automation (IA) Tools

This component essentially defines an AI program’s identity and core functions by establishing its tools and operations. IA tools are coded instructions that provide the necessary means for an AI program to do what it is programmed to do. It is analogous to the tools that a human would use to accomplish a specific goal, such as using a saw and hammer to build a table. Most importantly, these IA tools have been programmed by humans, and no AI programs have been able to truly build their own IA tools outside of a controlled environment. Currently, IA tools are the limiting factors for AI programs to evolve because they require humans to program new parameters and functions. For example, a human can easily play a game of chess or go that includes extra rows and columns in addition to those of a conventional gameboard; however, an AI program would not be able to understand these extra outside rules without being programmed to anticipate that possibility if it is not within the parameters of its existing IA tools.
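The chess example above can be sketched in a few lines. This is a hypothetical illustration, not drawn from any real program: the board size is an assumption baked into the IA tool at coding time, so the tool simply cannot reason about an “extra row” it was never given parameters for.

```python
# A minimal sketch of the hard-coded-parameters limit described above.
# BOARD_SIZE is the human-chosen parameter; the tool knows nothing beyond it.

BOARD_SIZE = 8  # a conventional chessboard; this value is an assumption

def is_valid_square(row: int, col: int) -> bool:
    """Return True only for squares inside the programmed 8x8 board."""
    return 0 <= row < BOARD_SIZE and 0 <= col < BOARD_SIZE

# A human can improvise on a 10x10 board; this IA tool cannot.
print(is_valid_square(3, 4))   # True  - inside the programmed parameters
print(is_valid_square(9, 9))   # False - an "extra row" the tool was never given
```

Until a human edits `BOARD_SIZE` (or rewrites the function), the program has no way to “try” the larger board—which is exactly the limiting factor described above.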

Machine Learning (ML)

This component allows an AI program to adjust its own parameters to optimize itself or to find alternate solutions. The main benefit of ML is that it allows finer optimization at superior development speeds by removing the programmer’s need to continually fine-tune the program. For example, an AI program that identifies a particular fruit in images can do so without a programmer continuously refining the parameters for that fruit at different angles and lighting conditions. This is often accomplished by “feeding” the AI program libraries of existing images or information, such as giving an image AI program numerous different pictures of a single fruit so the program can refine its own parameters. In addition, the better ML tools incorporate multiple layers of checks that run different analyses and judgment protocols on a particular task before coming to a consensus and a conclusion. Imagine numerous panels of appellate judges at different levels, with different backgrounds, trying to decide the outcome of a single case. To note, some people have misconstrued ML tools as an AI in itself because they appear to perform the other two AI components (human interface and IA tools). This is a misconception: ML merely optimizes and refines the other two components, and at present ML cannot truly create its own IA tools outside controlled environments. An example would be a user asking a voice app for the outside temperature in the mornings. The human interface receives the request and replies via voice while the IA tools check an online database for the outside temperature. If built into the program, the ML tools would provide a more customized human interface response (such as using the user’s name and shortening the reply) and could run the temperature IA tools in advance, based on the user’s likely location, in anticipation of the request.
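The fruit example above can be reduced to a toy sketch. Everything here is invented for illustration—the single “redness” feature, the sample values, and the brute-force search—but it shows the core idea: the program adjusts its own cutoff from labeled examples instead of a programmer hand-tuning it.

```python
# Hypothetical sketch: a one-parameter "fruit classifier" that learns its
# own cutoff from labeled examples. Feature and data are invented.

def learn_threshold(samples):
    """samples: list of (redness, is_apple) pairs, redness in 0-255.
    Try every candidate cutoff and keep the one with the most correct calls."""
    best_cutoff, best_correct = 0, -1
    for cutoff in range(256):
        correct = sum((redness >= cutoff) == is_apple
                      for redness, is_apple in samples)
        if correct > best_correct:
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff

training = [(200, True), (220, True), (180, True),   # apples
            (60, False), (90, False), (40, False)]   # limes
cutoff = learn_threshold(training)
print(cutoff)  # a value the program chose for itself from the data
```

Feeding the program more varied examples (different lighting, different angles) refines the learned cutoff with no programmer intervention—the benefit described above.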

A Brief Overview of the Levels of an AI—“Simple AIs,” “Sophisticated AIs,” and “True AIs”

When only two or three AI components are present, you have a “simple AI.” Contrary to its name, a “simple AI” can be extremely complex, and simple AIs are often extremely efficient at accomplishing what they need to accomplish. From the author’s experience, only the human interface and IA tools are necessary components for AI programs to operate in professions, especially in the legal profession. Although there are numerous definitions of AI that you can look up, often the mere appearance of a program performing a complex or redundant task quickly demonstrates “intelligence.” The ML component is not necessary when an AI program does not need to refine itself after numerous uses or when the program is easily adjustable by a programmer (which is a different topic on technology sustainability and deprecation). Examples of “simple AIs” in the legal profession are basic document automation programs/websites and e-discovery searching tools that try to find patterns based on inputted search terms.
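A toy example in the spirit of the e-discovery tools just mentioned: a human interface (the function call and printed result) plus an IA tool (pattern matching), with no ML component at all. The documents and search terms are invented for illustration.

```python
# A "simple AI" sketch: keyword pattern matching over documents,
# the core of basic e-discovery search tools. No ML involved.
import re

def search_documents(docs, terms):
    """Return the documents in which any search term appears (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return [doc for doc in docs if pattern.search(doc)]

docs = ["Merger agreement draft v2", "Lunch plans", "Agreement addendum"]
print(search_documents(docs, ["agreement"]))
# ['Merger agreement draft v2', 'Agreement addendum']
```

To a user, instantly finding matches across thousands of documents looks “intelligent,” yet the program is only executing fixed, human-written instructions.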

A “sophisticated AI” is when you have multiple AI components or numerous “simple AIs” working in conjunction to accomplish numerous tasks. The main difference between a “simple AI” and a “sophisticated AI” is not complexity (though that can be a factor), but rather the sheer number of separate AI components working alongside one another to accomplish the program’s task.

A “true AI” is when you have a program that creates its own IA tools without any humans programming it to learn those tools. To the best of the author’s knowledge, no “true AIs” exist outside of controlled environments in which humans already guided the programs’ development. For example, you and I could decide on a whim to try a new musical instrument tomorrow; an AI program cannot do so unless learning a new instrument is already within its existing IA tools’ parameters. A program cannot “try” unless preprogrammed to do so. Until an AI program can learn a new skill of its own accord, without any human intervention or guidance, a “true AI” is just a tale best told in cinema.

An Overview of the Program’s Bias Problem—Programmer’s Bias, Data Bias, and Application Bias

Creating AI programs for the legal field could be problematic due to the “Program’s Bias Problem.” A program has multiple stages of development at which biases can be introduced and affect it. In current diversity and inclusion research, the Program’s Bias Problem is akin to “implicit bias”—the idea that unconscious biases affect every individual’s value judgments. It is the author’s belief that in the programming world, biases show up at three stages of a program’s life cycle: (1) programmer’s bias, (2) data bias, and (3) application bias.

The first stage of bias is introduced at the development and programming level. A company’s development committee and its programmers must make binary cutoffs throughout various parts of a program’s code for the code to function. Often, a program’s responses and observations are reflected as binary inputs/outputs of strictly 1s or 0s (e.g., on/off; yes/no; white/black). Even when more nuanced decision-making is implemented (e.g., shades of gray), there are cutoff points or thresholds at the code level, such as 50/50 cutoffs. These cutoff points are set by the development committee or programmers (or both), implicitly or through ML algorithms that adjust the thresholds up and down; and they reflect the biases of the committee or programmers in pursuing their desired goals for the AI program.
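The cutoff problem above can be made concrete with a toy sketch. The scores and thresholds are invented for illustration: the same “shades of gray” score becomes a hard yes or no depending on where a committee (or an ML algorithm) draws the line.

```python
# Sketch of programmer's bias via cutoffs: one borderline score,
# two different human-chosen thresholds, two opposite outcomes.

def decide(score: float, cutoff: float) -> str:
    """Collapse a continuous score into a binary decision at the cutoff."""
    return "approve" if score >= cutoff else "deny"

applicant_score = 0.55                 # a borderline, "gray" case
print(decide(applicant_score, 0.50))   # approve - under one committee's cutoff
print(decide(applicant_score, 0.60))   # deny    - under another's
```

Nothing about the applicant changed between the two calls; only the threshold did—and choosing that threshold is where the developers’ biases enter the code.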

The second stage is data bias. For programs to function correctly with their IA tools and machine learning, they must be “fed” vast amounts of data. If the source of the data is biased, the biased data will make the program biased. An example is a college implementing an admissions AI program meant to accept the best candidates for the next school year. If the data come from the college’s hundred-plus years of history, especially the pre–civil rights era, then the program will likely incorporate racist biases into its application processes. Oscar Schwartz, “Untold History of AI: Algorithmic Bias Was Born in the 1980s,” Tech Talk (IEEE), Apr. 15, 2019.
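The admissions example can be sketched with a deliberately naive “model.” The records and group labels are invented for illustration: a program that simply learns historical acceptance rates will reproduce a discriminatory history verbatim.

```python
# Hypothetical sketch of data bias: a naive admissions "model" that
# learns per-group acceptance rates from past decisions. Biased history
# in, biased rates (and decisions) out.
from collections import defaultdict

def learn_rates(history):
    """history: list of (group, accepted) pairs from past decisions."""
    totals, accepts = defaultdict(int), defaultdict(int)
    for group, accepted in history:
        totals[group] += 1
        accepts[group] += accepted
    return {g: accepts[g] / totals[g] for g in totals}

# Invented pre-reform records: group B was rarely admitted.
history = [("A", 1), ("A", 1), ("A", 0), ("B", 0), ("B", 0), ("B", 1)]
rates = learn_rates(history)
print(rates["A"] > rates["B"])  # True - the old bias becomes the new model
```

The code contains no explicit rule against group B; the discrimination lives entirely in the data it was fed, which is what makes data bias hard to spot by reading the program alone.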

The third stage is application bias. This stage is how the program is used in the real world and how it could affect the overall biases for the other two stages. The best way to describe this bias is through an example. Imagine that you have the best program to detect drug usage. Amazing, right? However, what if that drug detection program is used only on specific groups, such as Hispanics and African Americans, at traffic stops? The use of the program in and of itself creates a bias (being used only at traffic stops and only on selected groups), and this bias would have an unintended consequential loop that affects programmer’s bias and data bias.

The three stages of bias have actual real-world consequences. A popular story shared among diversity and inclusion advisors involves image search AI programs distinguishing chihuahuas’ faces from blueberry muffins. Mariya Yao, “Chihuahua or muffin? My search for the best computer vision API,” freeCodeCamp, Oct. 12, 2017. A genuinely controversial real-world problem involved a popular Google image app that, in 2015, tagged images of an African American couple as “Gorilla.” Loren Grush, “Google engineer apologizes after Photos app tags two black people as gorillas,” Verge, July 1, 2015. After three years, Google “fixed” the problem by simply removing the label “Gorilla” so that no images would ever be labeled “Gorilla.” James Vincent, “Google ‘fixed’ its racist algorithm by removing gorillas from its image-labeling tech,” Verge, Jan. 12, 2018. This is a prime example of a lack of oversight in developing the AI program to improve its image recognition, of not testing with sufficiently varied data, and of not testing the program’s application before deployment. The result was an image recognition program with underlying racist problems arising from the Program’s Bias Problem.

Another example involves the recent firing/resignation of a prominent African American Google AI scholar, Timnit Gebru, for a soon-to-be-published paper on the risks involving large-scale human language processing AI programs. Matt O’Brien, “Google AI researcher’s exit sparks ethics, bias concerns,” Associated Press, Dec. 4, 2020. Timnit Gebru warned that this large-scale AI could cause smaller and more nuanced diction and linguistic cultural developments to be drowned out by a larger and more text-vocal majority. Karen Hao, “We read the paper that forced Timnit Gebru out of Google. Here’s what it says,” MIT Tech. Rev., Dec. 4, 2020. MIT Technology Review describes one of the major conclusions of the paper as the following:

It [large-scale language processing AI programs] will also fail to capture the language and the norms of countries and peoples that have less access to the internet and thus a smaller linguistic footprint online. The result is that AI-generated language will be homogenized, reflecting the practices of the richest countries and communities.

According to the same MIT Technology Review article, Gebru also highlights other problems: the massive energy costs and carbon footprint of training such an AI program; the fact that such a program manipulates language data to appear to understand human language without actually understanding it; and, if successful, its potential to generate misinformation through an illusion of meaning. Gebru’s paper and warning further underscore the importance of oversight from an AI program’s inception: bringing in diverse perspectives, fully understanding the problems with existing and future data sources, and anticipating the outcomes the program could produce and the biases it could perpetuate in its application.


Considering that legal AIs would rely on sources that are often not the best reflections of diversity and varied perspectives, the future of legal AIs appears bleak in terms of truly unbiased development and application. One of the biggest problems for future legal AI programs stems from numerous statistical findings on gender, race, LGBT+, and disability representation at law firms and courts. For example, representation of attorneys with disabilities in law firms was 0.54 percent in 2017, compared with U.S. Census Bureau data showing that about 20 percent of the general population had a disability in 2010. Angela Morris, “Are law firms committed to disability diversity? A handful of firms have taken action,” ABA J., Oct. 24, 2018. The problems of legal AIs will only become more prevalent as the legal profession embraces technology in the post-COVID world.