Information is the critical raw material that lawyers leverage and the most fundamental product that they deliver to their clients. Understanding the client’s position and circumstances is the foundation of lawyers’ work, and electronic information is a critical component of that understanding. In the information age, knowing how to glean useful information quickly from large amounts of disparate data to reach more accurate conclusions is what distinguishes truly great lawyers. This is not rocket science—it is data science.
January 01, 2016
Lawyering in the Information Age: Leveraging Analytics to Be a Better Attorney
What Is Data Analytics?
Data analytics is fundamentally about identifying patterns in information faster and more accurately. These patterns in data usually represent patterns in human behavior and thus can be used to piece together what happened or, in some cases, what is likely to happen. Descriptive analytics can be used to look back in time, while predictive analytics can look forward. In both temporal contexts, analytics is changing the way we approach our everyday lives and the way we practice law.
Both public and private entities are using analytics to achieve their objectives and drive engagement. Retailers like Amazon conduct data analytics on consumer behavior to make product recommendations. Netflix predicts which movies we might like, and CVS predicts which coupons might entice us to buy. Customer segmentation, which involves grouping consumers based on behavior, demographics, and psychographics, is at the heart of many companies’ engagement efforts. Customer segmentation allows businesses, such as hotels and airlines, to offer differential pricing models to optimize sales and maximize profits.
In the public sphere, colleges are using information available about prospective students to decide whether to admit them and, later, which classes, resources, and programs to offer them. Municipalities are using analytics to deliver public services such as power distribution, trash collection, and street maintenance. Each of these analytics applications is based upon patterns in human conduct. The algorithms used to understand these patterns at the consumer and citizen level are the same algorithms lawyers are employing to understand the behavior of investigative subjects.
Predictive Coding and E-Discovery
Like entities in the public and private sectors, lawyers who leverage analytics are gaining insights for, and about, their clients faster and more accurately. The best-known form of predictive analytics in the legal space is predictive coding, which is also called computer- or technology-assisted review (CAR or TAR, respectively). Predictive coding involves manually classifying a sample of data. The characteristics of the sample data are then used to build a mathematical model that predicts the classification of the remainder of the data in the larger population. In other words, the patterns within the data are used to build a model to predict the classification of new data.
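To make the idea concrete, the following is a minimal sketch of predictive coding in Python, not a description of any commercial review platform. It assumes a simple word-count (naive Bayes) model, and the four seed documents, the labels "relevant" and "not_relevant", and the function names are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Illustrative assumption: lowercase words split on whitespace.
    return text.lower().split()

def train(seed_set):
    """Build a word-count model from (text, label) pairs coded by a human reviewer."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of seed documents
    vocab = set()
    for text, label in seed_set:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the label whose word patterns best match the new document."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Start from how common the label was in the seed set.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in the seed set carry no signal
            # Add-one smoothing so an unseen word/label pair is not ruled out.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical seed set: a human reviewer has coded four documents.
seed_set = [
    ("pricing agreement with competitor discussed offline", "relevant"),
    ("delete the pricing emails before the audit", "relevant"),
    ("lunch menu for the team offsite", "not_relevant"),
    ("reminder about the holiday party schedule", "not_relevant"),
]
model = train(seed_set)

# The model then classifies documents no human has reviewed.
print(predict(model, "competitor pricing call"))
print(predict(model, "holiday lunch schedule"))
```

In practice the seed set would contain hundreds or thousands of documents and the model would be far more sophisticated, but the workflow is the same: people code a sample, and the model extends that coding to the rest of the population.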
The first known federal case to validate predictive coding in the context of e-discovery was Moore v. Publicis Groupe.1 In the opinion, Magistrate Judge Andrew Peck described computer-assisted review in e-discovery as “an acceptable way to search for relevant [electronically stored information] in appropriate cases.”2 He defined “computer-assisted coding” as tools that use “sophisticated algorithms to enable the computer to determine relevance, based on interaction with (i.e., training by) a human reviewer.”3 Human reviewers code a “seed set of documents,” and “[w]hen the system’s predictions and the [human] reviewer’s coding sufficiently coincide, the system has learned enough to make confident predictions for the remaining documents.”4 Moore was merely the first of many predictive coding cases.5
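The opinion does not reduce "sufficiently coincide" to a single number. As an illustrative sketch only, the comparison between the system's predictions and the reviewer's coding on a sample could be measured as follows; the function name, the labels, the eight-document sample, and the choice of agreement rate, precision, and recall as measures are assumptions made here, not requirements drawn from the case.

```python
def agreement_metrics(human, predicted, positive="relevant"):
    """Compare a reviewer's coding with the system's predictions on the same sample."""
    if not human or len(human) != len(predicted):
        raise ValueError("both codings must cover the same non-empty sample")
    pairs = list(zip(human, predicted))
    true_pos = sum(1 for h, p in pairs if h == positive and p == positive)
    false_pos = sum(1 for h, p in pairs if h != positive and p == positive)
    false_neg = sum(1 for h, p in pairs if h == positive and p != positive)
    agree = sum(1 for h, p in pairs if h == p)
    return {
        # Share of documents on which reviewer and system gave the same answer.
        "agreement": agree / len(pairs),
        # Of the documents the system called relevant, the share the reviewer agreed with.
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        # Of the documents the reviewer called relevant, the share the system found.
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }

# Hypothetical sample of eight documents (R = relevant, N = not relevant).
R, N = "relevant", "not_relevant"
human_coding = [R, R, N, N, R, N, N, N]
system_coding = [R, N, N, N, R, R, N, N]

metrics = agreement_metrics(human_coding, system_coding)
print(metrics)
```

In this made-up sample the reviewer and the system agree on six of eight documents (an agreement rate of 0.75), and precision and recall are each two-thirds. Whether figures like these are high enough to stop training the system is a judgment for the parties and the court, not something the code decides.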
Predictive coding has saved lawyers countless hours and significant cost over manual linear review. But this isn’t the real power of using predictive analytics in the e-discovery space. The point of e-discovery isn’t to get through documents; it is to get to the facts within those documents, and this is where the true strategic advantage of analytics comes into play.
Other Applications of Advanced Analytics
Being the first to know what happened and why can provide a significant strategic advantage over an adversary. For example, we represented a corporate client being sued by a former employee in a whistleblower qui tam action alleging a violation of the False Claims Act. The suit represented a significant threat to the company’s reputational and financial interests. We targeted the data most likely to illuminate the facts and applied advanced analytics to 675,000 documents; within four days, before we filed the answer to the complaint, we knew with certainty that the allegations were completely baseless. Armed with a firm grasp of the facts, we asked the plaintiff’s counsel to meet, and we voluntarily produced 12,500 documents that described the parties’ positions precisely. We then met with the plaintiff’s counsel and walked them through the evidence, laying out all of the facts. The case settled within days for what amounted to nuisance value based on a retaliation claim—without any discovery and at a small fraction of the cost budgeted for the litigation. This example illustrates that the real power of advanced analytics is the strategic advantage that comes with counsel getting to an answer quickly and accurately.
Advanced analytics also can be used for investigations, either in response to a regulatory inquiry or for purely internal purposes. Corporate clients can be faced with circumstances where determining the existence of a problem, and the scope of the potential problem, is critical. Without a clear understanding of the facts, management cannot move forward confidently; with a clear understanding of what occurred and what didn’t, management can move forward with certainty. For example, we represented a manufacturing company that decided to outsource its product safety function. A director of the safety division being outsourced demanded four times the offered severance pay and threatened to report alleged safety violations to a regulator if his demand was not met within 48 hours. We pulled 275,000 electronic documents from the division and applied analytics to determine whether the alleged violations had any merit. In less than 24 hours, we proved definitively that they didn’t. The company could move forward with certainty because we quickly and accurately ascertained the truth.
Understanding the facts with certainty is also critical in mergers and acquisitions (M&A) due diligence. M&A agreements typically include standard representations and warranties that the information disclosed by the target is correct and has a reasonable basis under generally accepted accounting principles. Such agreements often contain a provision specifying that if the acquiring party can prove that the target failed to disclose material information or that the information was incorrect, the acquirer will receive a purchase price adjustment, sometimes for many millions of dollars. These indemnity provisions are rarely acted on, however, because they require a claim to be filed within 30–90 days, and it is rare that an acquirer can prove up a claim in so short a period. Data analytics is changing that.
In several engagements, we have used our advanced analytics-based fact development strategies to obtain millions of dollars in indemnity claims for our clients. As soon as the merger or acquisition is completed, we analyze the target’s information systems and test the accuracy and validity of its disclosures. Within a matter of days, we know whether and to what extent the disclosures were inaccurate. The resulting indemnity claims are based not upon guesses but upon information from the target’s own systems and communications. It is very difficult to defend against words out of your own mouth.
As the examples above demonstrate, data analytics can be used to create a significant strategic advantage compared to a manual review of information. When a lawyer using analytics goes up against a lawyer who fails to use analytics, the results are inevitably skewed in favor of the lawyer who can reach the answer more quickly and more accurately.
Predicting the Future
All of these examples pertained to situations where we were looking back in time to describe what happened. Wouldn’t it be better to try to look into the future and predict when something bad will happen and act to prevent it? Analytics can do that too. Analytics is being used in the commercial space to understand patterns in consumer behavior. Information is gathered regarding what we buy, what we do, and our demographic and psychographic characteristics. Millions or even billions of these data points are gathered, and a model is built upon the patterns within them. These models are astonishingly accurate in predicting what we are going to buy, do, and even think.
Using the same analytics, we have built algorithms that can identify signs of misconduct as it is happening, rather than waiting for the results of misconduct to manifest themselves in situations that damage the company’s reputation or finances (or both). In building these algorithms, we have used large data sets (primarily email, chat, and texts) that had been the subject of litigation or regulatory investigations and already had the relevant documents identified. From these documents, we have built models that draw on text mining and analysis, social network analysis, and sentiment analysis. We have worked with experts in intelligence and law enforcement and with social scientists and criminal psychologists. It turns out that just as there are patterns in all human conduct, there are patterns in misconduct as well. The resulting models have been astoundingly accurate in detecting and preventing misconduct and are changing compliance from a largely reactive to a more proactive endeavor. We believe that such proactive compliance systems will become the norm.
For example, it has been reported that JPMorgan Chase is testing a program it plans to deploy company-wide in 2016 to analyze email, chat, and phone transcripts to identify collusion or the hiding of intention.6 Lawyers who understand how these analytical tools work can help their clients use them to promote corporate compliance and proactively stay out of trouble.
There is a danger, however, to these kinds of proactive monitoring systems. Although the algorithms focus on finding those who are violating policies or laws, they can be tailored to look for any pattern in human conduct, including personal opinions, voting practices, and kinds of marriages or families. Legal and information governance professionals therefore need to be aware of the newly developing field of data ethics. Powerful analytics can bring immense insights that give us strategic and business advantage, but they can also be misused.
For example, several major police departments are using predictive analytics to try to identify individuals who are most likely to commit crime in the future.7 The departments use data on prior criminal history, social networks and affiliations, and many other data points to try to predict who might be involved in crime, especially violent crime. The departments then focus resources on intervening to prevent any such crime. Some of these efforts, such as providing drug or mental health counseling and job services, are laudable. But other efforts, such as increasing police monitoring or even public shaming, are more disconcerting. In one instance in Kansas City, the police put up mug shots at a community meeting of those they thought were the most likely to commit crime, even though these individuals had been accused of no actual crimes.8
Conclusion
Information is key to every aspect of a lawyer’s job, from defending a company in the face of litigation to proactively optimizing regulatory compliance. Lawyers who think they can continue to practice law without analytics are putting themselves and their clients at a disadvantage.
Endnotes
1. 287 F.R.D. 182 (S.D.N.Y. 2012), aff’d, 2012 WL 1446534 (S.D.N.Y. Apr. 26, 2012).
2. Id. at 183.
3. Id. at 183–84.
4. Id. at 184, 199.
5. See, e.g., Rio Tinto PLC v. Vale S.A., 306 F.R.D. 125, 127, 129 (S.D.N.Y. 2015) (discussing the accepted practice of using TAR for document review and noting that TAR should not be held to a higher standard than keywords or manual review); Progressive Cas. Ins. Co. v. Delaney, No. 2:11-cv-00678-LRH-PAL, 2014 WL 3563467, at *9 (D. Nev. July 18, 2014) (indicating that the court would not have prevented the parties from agreeing to use predictive coding for e-discovery had it been included in the ESI protocol); Dynamo Holdings Ltd. P’ship v. Comm’r, 143 T.C. 183, 194 (2014) (sanctioning the use of predictive coding to produce relevant information).
6. See Hugh Son, JPMorgan Algorithm Knows You’re a Rogue Employee Before You Do, Bloomberg (Apr. 8, 2015), http://www.bloomberg.com/news/articles/2015-04-08/jpmorgan-algorithm-knows-you-re-a-rogue-employee-before-you-do.
7. John Eligon & Timothy Williams, Police Program Aims to Pinpoint Those Most Likely to Commit Crimes, N.Y. Times (Sept. 24, 2015), http://www.nytimes.com/2015/09/25/us/police-program-aims-to-pinpoint-those-most-likely-to-commit-crimes.html?_r=0.
8. Id.