In our last article, we wrote about various analytics tools available in software platforms for legal document review. One tool we identified is technology-assisted review (also known as “TAR” or “predictive coding”). Six or seven years ago, litigants began using TAR with little assurance that courts would accept the methodology. Since then, numerous U.S. and international courts have accepted the use of TAR. In short, TAR is now considered mainstream.
The Process of Technology-Assisted Review
TAR is a review strategy that combines attorney review with computer algorithms to determine which documents are relevant for purposes of document production. This necessarily reduces the number of documents attorneys need to review. Each document review platform has its own method of TAR, but generally, there are two main types of assisted review: one in which attorneys review rounds of sample documents until the system has been appropriately trained, and another, called “continuous active learning,” in which the system trains continuously as each new document is reviewed.
In either type, the computer uses coding done on a subset of documents to predict how an attorney would code the entire data set. The subset of documents initially reviewed can be created using a variety of methods, such as random sampling, stratified sampling (which attempts to pull documents from across concepts the computer has identified rather than taking a truly random sample), running search terms, or starting with a set of documents already reviewed or otherwise known to be relevant. The system then uses an algorithm to determine the responsiveness of the remaining documents. Depending on the system, it can be best to have the system trained by a senior attorney who is very familiar both with the case and with how to choose good example documents for TAR. However, vendors of newer models claim they are more adaptable and better able to resolve discrepancies in coding, such that less senior attorneys could train the system. Discuss technology changes with your vendor or litigation support team to understand what level of knowledge is necessary to implement the review.
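The training loop described above can be sketched in a few lines of code. This is a deliberately simplified illustration, not any vendor's actual algorithm: the documents, the responsiveness rule, and the term-overlap scoring are all invented stand-ins for the proprietary machine-learning models real platforms use.

```python
# Illustrative sketch of a continuous-active-learning-style loop.
# Term overlap with attorney-coded responsive documents stands in
# for the proprietary scoring algorithms real platforms use.

documents = {
    "doc1": "merger agreement signed by both parties",
    "doc2": "lunch menu for the office cafeteria",
    "doc3": "draft merger terms and closing conditions",
    "doc4": "holiday party planning notes",
}

# Seed set: coding an attorney has already done (True = responsive).
coded = {"doc1": True, "doc2": False}

def score(text, responsive_terms):
    """Score a document by word overlap with responsive documents."""
    return len(set(text.split()) & responsive_terms)

# Each "round": gather terms from responsive coded documents, rank the
# uncoded documents, and surface the highest-scoring one for review.
while len(coded) < len(documents):
    responsive_terms = set()
    for doc_id, is_responsive in coded.items():
        if is_responsive:
            responsive_terms |= set(documents[doc_id].split())
    uncoded = [d for d in documents if d not in coded]
    best = max(uncoded, key=lambda d: score(documents[d], responsive_terms))
    # In practice an attorney codes this document; here we simulate it.
    coded[best] = "merger" in documents[best]
    print(f"reviewed {best}: responsive={coded[best]}")
```

The loop surfaces the most promising uncoded document first, so each attorney decision immediately refines the next ranking, which is the essence of the continuous-active-learning approach.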
Perhaps the most difficult decision in a TAR project is determining when the project is complete. The decision to end a project must be validated. Most attorneys focus on the statistic known as “recall” (the percentage of responsive documents located) to validate TAR. There is no magic number. As a general guideline, however, courts have approved protocols where the parties committed to show that they likely had found 75 percent of the responsive documents. In a recently published protocol, a respected e-discovery special master noted that recall between 70 and 80 percent is consistent with a quality review, so long as the quantity and nature of the documents missed are considered.
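The recall statistic itself is simple arithmetic: the share of all responsive documents that the process actually located. The figures below are invented purely for illustration of the 70–80 percent range discussed above.

```python
# Recall: the fraction of all responsive documents that the TAR
# process actually found. Figures below are invented for illustration.

def recall(found_responsive, missed_responsive):
    """found_responsive: responsive docs the process located;
    missed_responsive: responsive docs it failed to locate."""
    total = found_responsive + missed_responsive
    return found_responsive / total if total else 0.0

# Suppose validation sampling indicates 78 responsive documents were
# located and 22 were missed.
r = recall(78, 22)
print(f"recall = {r:.0%}")  # prints "recall = 78%"
```

In practice the counts come from statistical sampling of the unreviewed population rather than an exhaustive count, which is why the parties must also consider the margin of error and the nature of the missed documents.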
It may be surprising to hear that courts approve of a system that leaves 25 percent—or even more—of the relevant documents unproduced. However, in validating the process, the producing party will evaluate whether the missed relevant documents in the sample set used to validate results are important documents or are of marginal value. If key documents contained in the validation set were categorized as nonresponsive, the producing party will need to carefully consider why those documents were missed, whether steps can be taken to locate other documents that may have been missed, and whether it is appropriate to continue the training process instead of calling it complete. Such an inquiry will be case-specific.
Courts have addressed this concern and counter that it mistakenly assumes that human review is the gold standard. In fact, humans frequently apply the criteria used for responsiveness inconsistently, resulting in recall rates lower than those reached using TAR. Courts justify TAR by looking to Federal Rules of Civil Procedure 1 and 26, with the focus being on just and speedy discovery that is proportional to the needs of the case. Courts are quick to point out that discovery is not meant to be perfect—it is meant to be reasonable—and TAR should not be held to a higher standard than any other document review process. In defending a TAR process, the focus will be on the defensibility of the process employed and a cost-benefit analysis of what else would need to be done and at what cost to locate additional responsive documents.
The Cost of TAR
In many cases, using TAR will be faster and less expensive than manual review of large sets of documents. Exactly how much can be saved depends on the goals of the TAR. If a party intends to review every document considered responsive by TAR before it is produced, the savings will not be as pronounced as they would be in a case where the parties are simply going to produce everything that is considered responsive with only a limited privilege review. And if a document set consists largely of documents that are not good fits for TAR (such as spreadsheets without a lot of text, pictures, drawings, graphs, etc.), the parties will need to consider that they will have to supplement the TAR process with manual review of that subset of the documents.
One way to save up-front costs of TAR is to reduce the size of the document population it is used on by running search terms before implementing TAR. Initially, there was a clear preference for not culling documents before using TAR. The tide seems to be turning on this issue, however. Because unculled data sets can be so large, prohibiting initial culling can reduce or even completely nullify the savings TAR offers. Thus, courts have recently started acknowledging that it may be reasonable to cull with keywords before running TAR. At industry conferences, even the most reluctant jurists are coming around on using pre-TAR search terms. We recommend that parties consider whether there are broad terms that could be run to reduce the population before implementing TAR.
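Mechanically, pre-TAR culling is just a keyword filter over the collection. The documents and search terms below are invented for illustration; in a real matter, the broad terms would be negotiated or derived from the claims and defenses.

```python
# Illustrative pre-TAR culling: keep only documents that hit at least
# one broad search term, shrinking the population fed into TAR.
# Documents and terms are invented for illustration.

documents = {
    "doc1": "merger agreement signed by both parties",
    "doc2": "lunch menu for the office cafeteria",
    "doc3": "quarterly earnings and merger forecast",
    "doc4": "holiday party planning notes",
}
broad_terms = {"merger", "earnings", "acquisition"}

culled = {
    doc_id: text
    for doc_id, text in documents.items()
    if set(text.split()) & broad_terms
}

print(f"population reduced from {len(documents)} to {len(culled)} documents")
```

Even a filter this crude halves the illustrative population; on a real collection of millions of documents, the per-document cost savings from hosting and training on a smaller set can be substantial.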
Transparency in TAR
When using TAR, a party will need to decide what exactly it is willing to share with the opposing side regarding the process. Many courts stress that transparency between the parties when TAR is used can keep discovery disputes to a minimum. There certainly are benefits to getting approval and buy-in from opposing counsel before starting TAR, such as decreased discovery disputes and a more civil, inexpensive discovery process.
Yet, it is difficult in practice to determine exactly what level of cooperation and transparency is required. Courts have noted with approval that some litigants willingly provided the documents that were used to train the system—both responsive and nonresponsive—to opposing counsel. However, absent a showing by the opposing side that a document production may have been insufficient, or other special circumstances, most courts would not likely require that level of transparency. And for some types of TAR—most notably those using continuous active learning—every document reviewed is training the system, so it would not be appropriate to provide the documents to opposing counsel.
Ultimately, the party producing the documents is in the best position to determine how to review its documents. That party’s decisions should stand unless the receiving party can show a deficiency in the production. If a party chooses to use TAR without involving the other side in the discussion, however, there is always the risk that additional work will have to be done to appease opposing counsel or to defend the process in court.
Copyright © 2018, American Bar Association. All rights reserved. This information or any portion thereof may not be copied or disseminated in any form or by any means or downloaded or stored in an electronic database or retrieval system without the express written consent of the American Bar Association. The views expressed in this article are those of the author(s) and do not necessarily reflect the positions or policies of the American Bar Association, the Section of Litigation, this committee, or the employer(s) of the author(s).