Recent years have seen a dramatic increase in awareness by the public and the legal community of the growing role of artificial intelligence (AI) and other computational techniques in the legal system and in legal practice. For example, several articles in the Fall 2017 issue of The SciTech Lawyer expressed fears that the rise of AI threatens to reduce legal jobs and to substitute unaccountable and opaque algorithmic processes for human judgment and due process.
As a researcher in the field of AI and law since the late 1980s, I am happy to see increased interest in AI on the part of the legal community. However, I believe that a distorted view of the accomplishments, capabilities, and promises of AI has taken hold in the popular imagination. This article will briefly review the history of AI and law research, the focus of current research, and my view of the most important future directions. I will discuss the perceived dangers of AI in the law in light of these observations.
First Generation AI and Law
The term “artificial intelligence” was coined by John McCarthy in a storied 1956 Dartmouth Symposium.1 The primary focus of early AI research was logical reasoning, e.g., proving theorems of logic,2 playing checkers,3 and deducing the molecular structure of chemical samples.4 To the pioneers of AI and law, the logical character of legal rules and arguments suggested that these new computational techniques for logical inference would be well suited for automated legal reasoning. The ability of logic-based systems to provide explanations for their conclusions by displaying the sequence of logical steps justifying the conclusion was a particular strength of this approach.5
Early prototypes were developed in tax,6 torts,7 and immigration8 that applied legal rules represented in the form of computable logic to facts (also represented in logical form) to determine whether a given legal conclusion followed from the facts. Early successes in this approach led to optimism that the transformation of the law by automated systems was imminent. It soon became apparent, however, that there were several major impediments to widespread adoption of logic-based legal systems, including (1) the difficulty of efficiently and verifiably representing legal texts in the form of logical rules; (2) the gap between legal terms and the language of ordinary discourse, such as a client’s description of a set of facts; and (3) institutional resistance to the introduction of AI into the legal workplace.9
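The rule-application step described above, together with the explanation-by-trace capability noted earlier, can be illustrated with a minimal sketch. The rules, fact names, and the `forward_chain` function below are hypothetical and are not drawn from any of the cited prototypes or from any actual statute; the sketch shows only how conclusions and their justifying steps might be derived together.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    name: str
    premises: frozenset
    conclusion: str

# Hypothetical toy rules for illustration only; not an encoding of real law.
RULES = [
    Rule("R1", frozenset({"offer", "acceptance"}), "mutual_assent"),
    Rule("R2", frozenset({"mutual_assent", "consideration"}), "contract_formed"),
]

def forward_chain(facts, rules):
    """Apply rules until nothing new follows, recording each inference step."""
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # A rule fires when all its premises are known and it adds something new.
            if rule.premises <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                premises = ", ".join(sorted(rule.premises))
                trace.append(f"{rule.name}: {premises} => {rule.conclusion}")
                changed = True
    return known, trace

known, trace = forward_chain({"offer", "acceptance", "consideration"}, RULES)
for step in trace:
    print(step)
```

The printed trace is the "explanation": each line names the rule applied and the facts that satisfied it. Note that the sketch presupposes exactly what the impediments listed above call into question, namely that the rules and the facts are already available in logical form.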
The first step in representing legal texts as logical rules is to identify the logical structure of the legal texts. However, there are often multiple plausible logical interpretations of even a carefully drafted legal text. Moreover, the institutional imprimatur of a validly enacted legal text does not extend to a formal logical representation of that text. So, unless a legislature or other law-enacting institution actually enacts a legal rule in logical form, as opposed to a text, any given logical representation is necessarily simply one interpretation of the legal text among many.
In addition to these theoretical issues, representing large, complex legal texts in formal logic presents significant practical challenges. Creating logical representations of legal texts is a highly specialized skill, requiring insight into the subtleties of both legal writing and computational logic. It is difficult to find people with both skills, and the process can be labor-intensive and error-prone and is hard to scale to large texts.
On the other hand, these representation issues do not preclude systems for relatively common, routine situations involving relatively stable law in which the objective is to draw prima facie legal conclusions without consideration of all the possible subtleties and exceptions that a skillful lawyer might raise.10 TurboTax typifies this situation, in which the effort of formalizing a set of rules is amortized across many users, major changes in rules typically occur only annually, and the system handles only routine solutions to routine situations without attempting to duplicate the skill of a tax attorney.
A second challenge to the logic-based approach is the “gap” between the technical terms occurring in legal rules and the language of ordinary discourse. Much of the skill of an attorney is recognizing what legal concepts might be applicable to a real-world situation, e.g., that a “barking dog” might be a “nuisance,” that the effect of a “forged check” might depend on the law of “negotiable instruments,” that believing that one might be hit by a thrown rock might be an “apprehension of harm,” etc. In addition, legal concepts are characterized by “open texture,” e.g., vagueness (“manifestation of assent”), variable standards (“due care,” “a reasonable time”), defeasibility (the rule against motor vehicles in the park may not apply to ambulances), and ambiguity (“my son” used in the will of a person with two sons).11 These complexities are not limited to legal concepts,12 but logic-based legal reasoning systems nevertheless require some approach to assessing legal terms that function as predicates in logical expressions.
Starting in the late 1980s, the dominant approach to the language gap and open texture was case-based reasoning (CBR), which involves evaluating whether a legal concept is satisfied under a given set of facts by comparing those facts to exemplars for which the truth value of the predicate is known, either as a matter of institutional imprimatur, as with legal precedents, or through consensus, as with prototypical situations.13 Fact situations that resemble both positive and negative instances of a legal predicate give rise to competing analyses, and differences in degrees of similarity give rise to a natural ordering of the persuasiveness of case-based arguments. Under this approach, the objective of a legal system was not giving a definite answer, but rather showing the strongest arguments for and against a given legal conclusion in light of the similarity between the facts of the instant case and situations with known legal outcomes. Several approaches to integrating logic-based approaches with CBR were prototyped, with some capable of producing analyses similar to those of law students.14 However, all the CBR systems took as input not natural language descriptions of case facts, but rather formal representations painstakingly engineered to facilitate computational comparison between cases. These CBR systems therefore have the same scalability problems as arise in formalizing legal rules: there are insufficient human resources to keep up with the flood of new cases.
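The comparison of an instant case to exemplars with known outcomes can be sketched as follows. This is a simplified illustration under stated assumptions, not a reconstruction of any of the cited systems: the precedents and factor names are invented, and Jaccard set overlap stands in for the far richer similarity assessments those systems used.

```python
def jaccard(a, b):
    """Set overlap: shared factors divided by all factors in either case."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical precedents: (name, hand-coded factors, whether "nuisance" was found).
PRECEDENTS = [
    ("Case A", frozenset({"barking_dog", "nighttime", "residential_area"}), True),
    ("Case B", frozenset({"barking_dog", "daytime", "rural_area"}), False),
    ("Case C", frozenset({"loud_music", "nighttime", "residential_area"}), True),
]

def arguments(facts, precedents):
    """Return precedents for and against the predicate, most similar first."""
    pro, con = [], []
    for name, precedent_facts, holds in precedents:
        score = jaccard(facts, precedent_facts)
        if score > 0:
            (pro if holds else con).append((score, name))
    return sorted(pro, reverse=True), sorted(con, reverse=True)

instant = frozenset({"barking_dog", "nighttime", "rural_area"})
pro, con = arguments(instant, PRECEDENTS)
print("For nuisance:    ", pro)   # [(0.5, 'Case A'), (0.2, 'Case C')]
print("Against nuisance:", con)   # [(0.5, 'Case B')]
```

The output is not a verdict but a ranked list of competing analogies, here with the strongest precedent on each side equally similar to the instant case. The hand-coded factor sets are the scalability bottleneck: someone must produce them for every new case.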
An alternative approach to CBR has focused on logical formalisms designed to model dialectical rule-based arguments. For example, two sets of rules might each support a different outcome under a given set of facts, and argumentation in this case might involve priorities over the policies represented by each set of rules.15
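A minimal sketch of this idea, reusing the vehicles-in-the-park example mentioned earlier, might look like the following. The rule names, the policies attached to them, and the priority ordering are all assumptions made for illustration; the formalisms in the cited literature are considerably more expressive.

```python
# Each hypothetical rule: (name, premises, conclusion, policy the rule serves).
RULES = [
    ("no_vehicles", {"vehicle", "in_park"}, "prohibited", "park_tranquility"),
    ("emergency", {"ambulance", "in_park"}, "permitted", "saving_life"),
]

# Assumed ordering over policies; earlier entries prevail in a conflict.
PRIORITY = ["saving_life", "park_tranquility"]

def resolve(facts, rules, priority):
    """Collect every applicable rule, then let the highest-priority policy win."""
    applicable = [r for r in rules if r[1] <= facts]
    if not applicable:
        return None, []
    applicable.sort(key=lambda r: priority.index(r[3]))
    winner = applicable[0]
    return winner[2], [r[0] for r in applicable]

outcome, considered = resolve({"vehicle", "ambulance", "in_park"}, RULES, PRIORITY)
print(outcome, considered)  # permitted ['emergency', 'no_vehicles']
```

Both rules apply to the ambulance, and the argument turns on which policy prevails. Changing the order of `PRIORITY` reverses the outcome, which is where the substantive legal dispute would lie.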
The objective of both the CBR and the dialectical rule-based systems was less to predict the outcome of a given case than to generate the arguments that an attorney might make, and could understand, for or against a given legal conclusion. This was motivated by the observation that conclusions unsupported by cogent arguments are of little value in most legal contexts. In this respect, AI systems in law resembled those in other expert fields, such as medicine, in which a computer’s advice would probably be ignored unless it was accompanied by an explanation. The focus on argumentation led to an extensive literature on the nature of legal argumentation, demonstrating how formal models of rules and precedents could be used in a dialectic process by opposing parties to support and rebut claims.16 Because of the difficulty of scaling these systems, however, they tended to be limited to small, hand-constructed examples involving at most a few dozen cases and rules.
Second Generation AI and Law
In recent years, advances in both human language technology and techniques for large-scale data analysis (big data) have vastly increased capabilities for automated interpretation of legal text. This has given rise to data-centric approaches to legal problem solving that can, under some circumstances, finesse the two key impediments to logic-based systems identified above: rule formalization and the language gap. However, data-centric techniques typically address different legal tasks than logic-based approaches; in general, they are well suited for tasks that involve exploiting knowledge latent in legal document collections or databases or that require empirical or statistical characterizations of cases, but current data-centric techniques are poorly suited to tasks involving generation or analysis of legal discourse and have little explanation capability.17
The best known application of data-centric techniques to the law is prediction of case outcomes, such as duration, costs, potential awards or punishments, and the probability of success of claims, motions, or other pleadings. There is a long history of proprietary prediction systems in the insurance industry, e.g., for estimating the settlement value of a tort claim.18 An area of particularly active current commercial activity is litigation assistance, that is, providing predictions to assist with strategic litigation decisions.19 Perhaps surprisingly, these litigation assistance tools typically base outcome predictions not on the merits of the case, but rather on ancillary factors, such as the judge, attorneys, court, cause of action, etc. Although unrelated to the merits of the case, these factors are often surprisingly predictive, e.g., of the probability of success of a given motion or the relative effectiveness of different attorneys before a given judge.20
Several recent projects have sought to predict the outcome of appellate cases by applying machine learning algorithms to hand-engineered representations of cases. This approach has led to surprisingly accurate predictions of the outcome of U.S. Supreme Court21 and Canadian Tax Court decisions.22 However, these results are somewhat misleading in that the predictions are only as good as the case representations upon which they are based. As with the rule-based approach, the time and expertise required to devise a representation for each case makes it impractical to scale up this approach.
More recent research has focused on case outcome prediction from the text of cases without a manual representation step. In these approaches, machine learning algorithms learn to predict case outcomes based on word and phrase statistics in previous cases, without an effort to model the underlying legal rules or arguments. This approach has shown some promise for routine cases with a limited number of possible outcomes and relatively predictable fact patterns, such as occur in low-level administrative decisions.23
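To make "word and phrase statistics" concrete, the sketch below trains a tiny naive Bayes classifier on word counts alone. Naive Bayes is chosen here only for brevity and is not necessarily the algorithm used in the cited work; the training texts and outcome labels are invented.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label). Returns word counts per label, label counts, vocabulary."""
    word_counts, label_counts, vocab = {}, Counter(), set()
    for text, label in docs:
        tokens = text.lower().split()
        word_counts.setdefault(label, Counter()).update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict(text, model):
    """Pick the label with the highest log-probability under add-one smoothing."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(label_counts[label] / total_docs)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((counts[token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical decision texts with known outcomes.
DOCS = [
    ("claim filed late without documentation", "denied"),
    ("claim filed late missing signature", "denied"),
    ("claim filed timely with complete documentation", "granted"),
    ("claim timely complete signature verified", "granted"),
]
model = train(DOCS)
print(predict("late missing documentation", model))  # denied
print(predict("timely complete", model))             # granted
```

Nothing in the model represents a rule, a deadline, or an argument. It has learned only that certain words co-occur with certain outcomes, which is both why it needs no manual representation step and why it can offer no legal justification for its prediction.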
Predictive ability is related to certain aspects of human expertise. For example, an attorney’s advice about whether to go to trial on a certain claim is typically based on a combination of the merits of the case, e.g., the relative strengths of alternative competing legal arguments, and more statistical factors, such as the past behavior of judges and other participants in legal disputes, drawn from the attorney’s experience. Current predictive litigation assistance tools neglect the first set of factors (the relative strength of competing arguments) and focus exclusively on the second (generalizations based on past events).
Besides using different methods, systems for predicting case events differ in their objectives from the earlier logic-based and CBR systems, whose purpose was to produce cogent legal arguments based on legal rules and precedents. As a result, the AI and law community has in general been quite resistant to the idea that predictive systems are AI at all, viewing these techniques instead as shallow statistical methods devoid of legal knowledge and incapable of providing any insights into legal systems. However, these reservations about predictive systems have been overtaken by the legal tech explosion.
The Stanford Center for Legal Informatics currently lists nearly 800 legal tech companies whose applications span from online dispute resolution, practice management, and document automation to legal research and education, e-discovery, and analytics.24 In most cases, the technology underlying these companies can only loosely be characterized as AI; instead, it reflects the diverse ways that data analytics and user-focused interfaces can help automate routine aspects of legal practice and improve the experience of attorneys and citizens in addressing legal problems. Very few of the technologies involve inference over legal rules or generation of legal arguments, and few focus on providing explanations or justifications for legal recommendations.
Among the most troublesome examples of opacity are risk assessment systems, such as the COMPAS system used by courts and parole boards to forecast future criminal behavior. These are statistical models, not AI systems, but they are susceptible to several types of bias. First, systems that are trained on data (e.g., descriptions of prior defendants’ relevant features and subsequent offenses), as opposed to explicit rules, are vulnerable to selection bias, that is, such models are accurate as to subsequent defendants only to the extent that the defendants on which the system is trained are truly representative of the defendants whose risk is to be estimated. If, for example, the original pool of defendants is the result of biased policing, then the resulting model will reflect this underlying bias. Second, the model itself might embody biases. Proprietary code is unavailable for inspection and challenge, and research triggered by the ProPublica investigation of the COMPAS system25 has clarified that “fairness” actually refers to several distinct and incompatible concepts, e.g., identical likelihood, regardless of race, that a defendant flagged as high risk will in fact reoffend vs. identical likelihood, regardless of race, that non-reoffenders will be flagged as high risk.26 Without access to the underlying code, it is difficult for a defendant to challenge which version of “fairness,” if any, the algorithm attempts to implement or whether the algorithm correctly implements the model that it purports to use. Concerned about delegation of the power to adjudicate individual rights to opaque and unaccountable algorithmic systems, the European Union adopted a “General Data Protection Regulation,” which appears to guarantee an explanation for algorithmic assessment of rights,27 but there seems to be little appetite in the United States for similar institutional explanation guarantees.28
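The incompatibility between the two notions of fairness described above can be seen in a small worked example. The counts below are invented for illustration and are not COMPAS data; the group labels and the `rates` helper are likewise hypothetical. The two groups differ only in their underlying rate of reoffense.

```python
def rates(tp, fp, fn, tn):
    """Return (precision among those flagged, false positive rate among non-reoffenders)."""
    ppv = tp / (tp + fp)   # of those flagged high risk, the share who reoffend
    fpr = fp / (fp + tn)   # of non-reoffenders, the share flagged high risk
    return ppv, fpr

# Hypothetical groups of 100 defendants each.
# Group A: 50 reoffend. Group B: 20 reoffend.
group_a = rates(tp=40, fp=10, fn=10, tn=40)
group_b = rates(tp=16, fp=4, fn=4, tn=76)

print("Group A: PPV=%.2f FPR=%.2f" % group_a)  # Group A: PPV=0.80 FPR=0.20
print("Group B: PPV=%.2f FPR=%.2f" % group_b)  # Group B: PPV=0.80 FPR=0.05
```

In this example a high-risk flag means the same thing in both groups (80 percent of flagged defendants reoffend), yet a non-reoffender in Group A is four times as likely to be wrongly flagged as one in Group B. When base rates differ, a model that is not a perfect predictor cannot in general satisfy both criteria at once, so the choice between them is a policy decision that proprietary code conceals.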
In general, computer scientists have a very strong preference for publication of technical details and open-source software and data collections. This is because the ability to assess the significance of a given piece of research using community-wide evaluation metrics and data sets is a powerful driver for technical progress. Inability to view the methods, and reproduce the results, of proprietary systems leads to opacity whose source is institutional rather than algorithmic. For example, the field of e-discovery largely lacks the community-wide evaluation standards and data sets that have led to steady advancement in the broader information retrieval research community.29
In summary, current data-centric approaches typically address predictive tasks and routine activities related to legal practice that do not involve modeling human legal expertise or generating arguments or justifications recognizable as attorney work products. These models are typically opaque, but some of this opacity arises from institutional tolerance of proprietary systems and the absence of community-wide evaluation criteria and data sets that would open these algorithms to public inspection.
Third Generation AI and Law
I will conclude by discussing some trends that I anticipate will characterize AI and law in the near future: systems with increased explanatory capability, enactment of legal rules in executable form, legal discourse generation, and decision support systems.
Reducing the opacity of the machine learning algorithms that perform best for many purposes (neural network models, often referred to as deep learning) is the focus of very active current research. For example, DARPA, the agency responsible for the Internet, autonomous vehicles, and a wide range of other technical innovations, is funding a program in “explainable AI” (XAI).30 Previous XAI work has fallen into two categories: efforts to develop neural networks that can identify the aspects of an input that are responsible for the output, and efforts to add a separate post hoc justification step. Exemplifying the former are attention networks, neural networks that learn the most significant parts of input data.31 After processing each instance, the relative importance of the input can be determined from the network, e.g., the portions of an image (ears, whiskers) that played the biggest role in classifying the image (“cat”), or the portions of a case description (“barking dog”) most relevant to a legal conclusion (“nuisance”). An example of the latter is the Split-Up system for division of marital property in a divorce, which uses a neural network trained on previous cases to predict a division and a separate rule-based module to justify the resulting division.32 This two-stage process has a parallel in the distinction between the “logic of discovery” and the “logic of justification” widely recognized in the philosophy of science,33 but some might argue that due process issues are inevitable in any system whose explanation is only a post hoc rationalization of the actual decision process.
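The way an attention mechanism exposes the relative importance of inputs can be sketched in a few lines. In an actual network the per-token vectors and the scoring weights would be learned from data; here they are made-up numbers, and the `attend` function is a hypothetical simplification in which each token's score is a dot product with a weight vector.

```python
import math

def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(states, w):
    """Score each token vector, normalize, and form the weighted summary vector."""
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in states]
    alphas = softmax(scores)
    dim = len(states[0])
    context = [sum(a * h[d] for a, h in zip(alphas, states)) for d in range(dim)]
    return alphas, context

tokens = ["the", "barking", "dog", "next", "door"]
# Made-up token vectors and scoring weights; a trained network would supply these.
states = [[0.1, 0.0], [0.9, 0.9], [0.8, 0.9], [0.2, 0.1], [0.1, 0.2]]
w = [2.0, 2.0]

alphas, context = attend(states, w)
for token, alpha in sorted(zip(tokens, alphas), key=lambda p: -p[1]):
    print(f"{token:8s} {alpha:.2f}")
```

The downstream classifier sees only the weighted summary `context`, so the weights `alphas` indicate which tokens contributed most to it; with these numbers, "barking" and "dog" receive the largest weights. Whether such weights amount to an explanation in the legal sense is a separate question.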
The law differs from most areas of human expertise in that legal justifications must, in general, be based on legal rules, expressed in text, whose validity depends on enactment or recognition by some official body. As mentioned above, modeling such rules in computer executable form is problematic because any given logical representation is simply one possible interpretation of the official text. A solution to this problem is for rule-enacting bodies (e.g., legislatures or regulatory bodies) to enact rules in logical, computer executable form. By producing officially sanctioned rules, this could open the door to improved legislative and regulatory consistency (e.g., logical evaluation of consistency with existing rules or automated evaluation with prototype cases intended to be covered by new rules); improved voluntary compliance, since executable rules could be used by members of the public to test alternative actions or situations; and improved administration, since the consistency of procedures, forms, websites, and other entities that mediate between rules and citizens could be subject to formal verification. Because the idea of enactment of rules in computer executable form is unfamiliar to rulemaking bodies, experiments with this approach are likely to come only when there are well-publicized projects that demonstrate the benefits of automated interpretation of legal rules. Smart contracts, algorithmic implementation of transaction rules popular in the context of blockchain-based cryptocurrencies, may play a role in popularizing the idea of computational implementation of legal rules.34
First generation AI typically sought to produce not just conclusions, but also the strongest arguments for and against a legal conclusion. However, dependence on manual representation of rules, cases, and facts prevented these techniques from scaling beyond the demonstration stage. I anticipate that the goal of generating legal discourse, including arguments, justifications, memoranda, and routine motions and orders, will reemerge as a central goal of AI and law, but that modern data-centric techniques will make these techniques less dependent on manual processes.
One approach to legal discourse generation would exploit the ability of neural networks to generate, as well as input, text. For example, neural network summarization systems input a larger text and output a concise paraphrase,35 and question answering systems input a question and output an answer.36 Such systems can be trained to produce arguments or justifications for legal conclusions, provided that sufficient training examples can be obtained.
Another line of research that may lead to improved legal discourse techniques is argumentation mining, a set of techniques for identifying arguments in text and analyzing their role in a discussion or decision.37 Finally, I expect that as techniques for analysis and generation of legal text improve, there will be increasing emphasis on decision support, that is, systems that help assemble and analyze documents relevant to a given task and assist with generating work products, based on text mining previous work products and observations of the user.38
The objective of AI and law research from its inception has been to produce understandable and persuasive analyses of legal problems. Current predictive models that have little explanatory capability are a deviation from this line of research. However, the current trend in AI is toward explanation and transparency. A key promise of the third generation hybrid of data-centric methods with discourse generation goals is that such systems will produce output that can be evaluated for validity, persuasiveness, and bias. This would make the opacity of the internal computation steps of little importance for the same reason that the opacity of a human attorney’s internal computations has little importance: because it is the validity, persuasiveness, and bias of an attorney’s work product that matters, not the internal processes that produced it.
The ultimate promise of AI and law is increased transparency of the regulatory and adjudicatory systems and improved access to justice, not by automating the analytical expertise of attorneys or encroaching on the discretionary authority of attorneys and adjudicators, but by increasing the predictability of legal institutions and the intelligibility of legal texts.
1. James Moor, The Dartmouth College Artificial Intelligence Conference: The Next Fifty Years, 27 AI Mag., no. 4, Winter 2006, at 87.
2. Allen Newell et al., Elements of a Theory of Human Problem Solving, 65 Psychol. Rev. 151 (1958).
3. A.L. Samuel, Some Studies in Machine Learning Using the Game of Checkers, 3 IBM J. 210 (1959).
4. Bruce G. Buchanan & Edward A. Feigenbaum, DENDRAL and Meta-DENDRAL: Their Applications Dimension (Stanford Univ., Technical Paper No. STAN-CS-78-649, 1978).
5. Bruce G. Buchanan & Edward H. Shortliffe, Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project (1984).
6. L. Thorne McCarty, Reflections on Taxman: An Experiment in Artificial Intelligence and Legal Reasoning, 90 Harv. L. Rev. 837 (1977).
7. Jeffrey Alan Meldman, A Preliminary Study in Computer-Aided Legal Analysis (Aug. 29, 1975) (unpublished Ph.D. thesis, Massachusetts Institute of Technology).
8. M.J. Sergot et al., The British Nationality Act as a Logic Program, 29 Comm. ACM, no. 5, May 1986, at 370.
9. Rees W. Morrison, Market Realities of Rule-Based Software for Lawyers: Where the Rubber Meets the Road, in Proceedings of the Second International Conference on Artificial Intelligence and Law 33 (1989).
10. See, e.g., L. Karl Branting, An Advisory System for Pro Se Protection Order Applicants, 14 Int’l Rev. L. Computers & Tech. 357 (2000).
11. H.L.A. Hart, The Concept of Law (1961).
12. George Lakoff, Women, Fire, and Dangerous Things: What Categories Reveal about the Mind (1987).
13. Edwina L. Rissland et al., Case-Based Reasoning and Law, 20 Knowledge Engineering Rev. 293 (2005).
14. L. Karl Branting, Reasoning with Rules and Precedents 135–44 (2000); David B. Skalak & Edwina L. Rissland, Arguments and Cases: An Inevitable Intertwining, 1 Artificial Intelligence & L. 3 (1992).
15. Trevor Bench-Capon, Open Texture and Argumentation: What Makes an Argument Persuasive?, in Logic Programs, Norms and Action 220 (Alexander Artikis et al. eds., 2012).
16. Henry Prakken & Giovanni Sartor, Law and Logic: A Review from an Argumentation Perspective, 227 Artificial Intelligence 214 (2015).
17. For a discussion of the relationship between rule-based and data-centric approaches from a somewhat more technical perspective, see L. Karl Branting, Data-Centric and Logic-Based Models for Automated Legal Problem Solving, 25 Artificial Intelligence & L. 5 (2017).
18. Mark Peterson & D.A. Waterman, Rule-Based Models of Legal Expertise, in Computing Power and Legal Reasoning 627 (Charles Walter ed., 1985).
20. See, e.g., Mihai Surdeanu et al., Risk Analysis for Intellectual Property Litigation, in Proceedings of the 13th International Conference on Artificial Intelligence and Law 116 (2011).
21. Daniel Martin Katz et al., A General Approach for Predicting the Behavior of the Supreme Court of the United States, 12 PLoS ONE, no. 4, 2017.
22. Nikolaos Aletras et al., Predicting Judicial Decisions of the European Court of Human Rights: A Natural Language Processing Perspective, PeerJ CompSci (Oct. 24, 2016), https://peerj.com/articles/cs-93/.
23. L. Karl Branting et al., Inducing Predictive Models for Decision Support in Administrative Adjudication, Paper Presented at the MIREL 2017 Workshop on “MIning and REasoning with Legal texts” (June 16, 2017), http://www.mirelproject.eu/MIRELws@ICAIL/MIRELwsPubs/Branting-etal-MIRELwsAtICAIL.pdf.
25. Julia Angwin et al., Machine Bias, ProPublica (May 23, 2016), https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.
26. Jon Kleinberg et al., Inherent Trade-Offs in the Fair Determination of Risk Scores, in Proceedings of the 8th Innovations in Theoretical Computer Science Conference 43:1 (Christos H. Papadimitriou ed., 2017).
27. Bryce Goodman & Seth Flaxman, EU Regulations on Algorithmic Decision-Making and a “Right to Explanation,” 38 AI Mag., no. 3, Fall 2016, at 50.
28. See, e.g., State v. Loomis, 881 N.W.2d 749 (Wis. 2016).
29. The National Institute of Standards and Technology (NIST) had a Legal Track for evaluation of e-discovery technologies from 2006–2012 (https://trec-legal.umiacs.umd.edu), but no such competition appears to exist today.
30. David Gunning, Explainable Artificial Intelligence (XAI), DARPA, https://www.darpa.mil/program/explainable-artificial-intelligence (last visited Mar. 14, 2018).
31. Colin Raffel & Daniel P.W. Ellis, Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems, Paper Presented at the Fourth International Conference on Learning Representations (May 2–4, 2016), https://openreview.net/pdf?id=81DD7ZNyxI6O2Pl0Ul5j.
32. John Zeleznikow & Andrew Stranieri, The Split-Up System: Integrating Neural Networks and Rule-Based Reasoning in the Legal Domain, in Proceedings of the Fifth International Conference on Artificial Intelligence and Law 185 (1995).
33. Carl R. Kordig, Discovery and Justification, 45 Phil. Sci., no. 1, Mar. 1978, at 110.
34. Florian Idelberger et al., Evaluation of Logic-Based Smart Contracts for Blockchain Systems, in Rule Technologies: Research, Tools, and Applications 167 (Jose Julio Alferes et al. eds., 2016).
35. Sören Brügmann et al., Towards Content-Oriented Patent Document Processing: Intelligent Patent Analysis and Summarization, 40 World Pat. Info., Mar. 2015, at 30; Jianpeng Cheng & Mirella Lapata, Neural Summarization by Extracting Sentences and Words, in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics 484 (2016).
36. Jun Yin et al., Neural Generative Question Answering, in Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence 2972 (2016).
37. See, e.g., Ass’n for Computational Linguistics, Proceedings of the 4th Workshop on Argument Mining (2017).
38. L. Karl Branting et al., Cognitive Assistance for Administrative Adjudication, in Ass’n for the Advancement of Artificial Intelligence (AAAI), Cognitive Assistance in Government and Public Sector Applications, AAAI Technical Report FS-17-02, at 134 (2017); Vishwas P. Pethe et al., A Specialized Expert System for Judicial Decision Support, in Proceedings of the Second International Conference on Artificial Intelligence and Law 190 (1989).