E-discovery Tools and Strategies to Save Time, Reduce Costs, and Improve Outcomes

  • Upon receiving electronically stored information (ESI), your first step is to find the right processing platform. Be sure to confirm that it offers an intuitive user interface.
  • Processing e-discovery involves techniques to streamline and reduce the volume of your data, including de-duplication, email threading, keyword searching, and data culling.
  • Once your e-discovery has been processed, you must organize it. One of the primary strategies for data organization involves tagging and categorizing.

Imagine it. You recently took on a significant employment discrimination case in which your client, a former employee of a large multinational corporation, alleges wrongful termination based on discriminatory practices. The corporation has a sizable in-house legal team and contracts with a top-tier e-discovery firm.

As you begin the discovery process, the magnitude of the task at hand becomes clear. First, the corporation produces a massive amount of electronically stored information (ESI), including years’ worth of emails, human resources documents, internal communications, performance reviews, and policy manuals, stored across various digital platforms. And, much to your surprise, the ESI also includes audio files, text messages, and even video footage from a variety of cameras. You learn that the ESI spans different types and formats (including legacy system files), cloud storage entries, and even encrypted data.

Your arsenal, however, is modest. With litigation deadlines looming, the pressure mounts on you to efficiently review the materials. And, once you have reviewed the ESI produced, you still need to figure out how to summarize, share, and present the materials during depositions, evidentiary hearings, and trial.

Among the protocols that follow the receipt of ESI, few matter more for cost and time savings than processing and organizing your e-discovery. The goal of processing is to transform the raw data into a more manageable state so that you can start to organize and strategize. This article addresses the steps of processing and how best to organize your e-discovery. It also includes tips on how you can use that processed data with new artificial intelligence (AI) tools for the next step: drafting. Finally, it touches on how to use new tools to present your evidentiary materials at depositions or adversary proceedings.

First Step: Select a Processing Platform

Upon receiving ESI, your first step is to find the right processing platform. Many vendors cater to solo practitioners. When selecting your vendor, be sure to confirm that the platform offers an intuitive user interface. No solo practitioner wants to spend time on extensive platform training. You also need to determine accessibility from various devices: confirm you can access the system from home or from court. Ensure there are strong data security measures with end-to-end encryption to protect sensitive information.

The data processing platform you choose should have the ability to (1) remove system files and nonrelevant data, (2) unpack compressed files, (3) convert different file formats into a consistent format for review, and (4) provide advanced searching abilities to filter the data. You also want to make sure that you can easily produce and export documents in formats pursuant to court requirements.

What Is the “Processing” of E-discovery?

Once you have selected your vendor and the data is uploaded into the platform, the actual processing can take place. Processing of e-discovery involves using a range of techniques to streamline and reduce the volume of your data. Techniques include (but are not limited to) de-duplication, email threading, keyword searching, and data culling.


De-duplication

When opposing counsel produces vast quantities of ESI to you, multiple instances of the same file usually exist across the custodians’ data. This redundancy inflates the amount of data you need to review. De-duplication eliminates duplicate copies of documents. It works by comparing documents in the dataset against each other to identify and remove duplicates. This comparison is typically done using hash values (a digital fingerprint computed from each file’s contents), so that only identical copies are treated as duplicates. You can manage this process on two distinct levels: globally or by custodian.
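To make the hash comparison concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hash function and uses made-up file names and contents; a commercial platform may use a different algorithm and will usually hash emails on selected fields, not raw bytes.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """Return the SHA-256 digest (the "digital fingerprint") of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def group_by_hash(documents: dict[str, bytes]) -> dict[str, list[str]]:
    """Group document names by hash; each group holds byte-identical copies."""
    groups: dict[str, list[str]] = {}
    for name, content in documents.items():
        groups.setdefault(file_hash(content), []).append(name)
    return groups


# Hypothetical production: two custodians hold the same policy file.
documents = {
    "smith/policy.pdf": b"Anti-discrimination policy v3",
    "jones/policy.pdf": b"Anti-discrimination policy v3",
    "jones/review.docx": b"2021 performance review",
}

groups = group_by_hash(documents)
# Keep the first copy in each group and set the rest aside as duplicates.
unique_docs = [names[0] for names in groups.values()]
print(unique_docs)
```

Because a change of even one byte produces a different hash, this approach only catches exact copies.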

Global de-duplication removes duplicates across the entire dataset, leaving only one instance of each document regardless of the custodian. It offers the potential for the greatest cost savings. On the other hand, custodian-based de-duplication maintains a copy of a document for each custodian who has it. This version is often used when it is important to show that multiple parties had access to the same document. It is less aggressive than global de-duplication but still helps you reduce the dataset to a more manageable size.
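The difference between the two levels comes down to what counts as "already seen." The sketch below is an illustration only, with a hypothetical `Doc` record and sample data: global mode keys on the hash alone, while custodian mode keys on the custodian and the hash together.

```python
import hashlib
from typing import NamedTuple


class Doc(NamedTuple):
    custodian: str
    name: str
    content: bytes


def dedupe(docs: list[Doc], by_custodian: bool) -> list[Doc]:
    """Keep the first copy of each document, globally or per custodian."""
    seen: set[tuple[str, str]] = set()
    kept: list[Doc] = []
    for doc in docs:
        digest = hashlib.sha256(doc.content).hexdigest()
        # In global mode the custodian is left out of the key entirely.
        key = (doc.custodian if by_custodian else "", digest)
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept


docs = [
    Doc("smith", "memo.pdf", b"Reorganization memo"),
    Doc("smith", "memo-copy.pdf", b"Reorganization memo"),
    Doc("jones", "memo.pdf", b"Reorganization memo"),
    Doc("jones", "review.docx", b"2021 performance review"),
]

global_set = dedupe(docs, by_custodian=False)
custodian_set = dedupe(docs, by_custodian=True)
print(len(global_set), len(custodian_set))
```

In this sample, the custodian-level result keeps one copy of the memo for each custodian, which preserves the fact that both had it.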

Another form of de-duplication is “near-duplicate detection,” which can group documents that are almost identical, except for minor differences.
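Near-duplicate detection cannot rely on hashes, because any small change alters the hash. One simple way to illustrate the idea is a text similarity score with a cutoff. The sketch below uses `difflib.SequenceMatcher` from the Python standard library and an assumed threshold of 0.9; commercial platforms use their own, more scalable methods, and the sample texts are invented.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-to-1 similarity ratio between two texts."""
    return SequenceMatcher(None, a, b).ratio()


def group_near_duplicates(texts: list[str], threshold: float = 0.9) -> list[list[int]]:
    """Assign each text to the first group whose lead document it resembles."""
    groups: list[list[int]] = []
    for index, text in enumerate(texts):
        for group in groups:
            if similarity(texts[group[0]], text) >= threshold:
                group.append(index)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([index])
    return groups


texts = [
    "The employee handbook was revised on March 1, 2021.",
    "The employee handbook was revised on March 2, 2021.",
    "Quarterly sales figures are attached for your review.",
]

near_groups = group_near_duplicates(texts)
print(near_groups)
```

The threshold is a judgment call: set it too low and unrelated drafts get grouped together, too high and trivial variations are reviewed separately.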

Effective de-duplication requires a deep understanding of the case’s context to determine the scope. While de-duplication is only one part of the e-discovery process, it plays a pivotal role in achieving your goals of efficiency and cost-effectiveness.

Email Threading

Email threading is another processing technique to save you time. Email threading sorts and groups email messages into their original conversational chains, or “threads.” It helps you understand the context and flow of a conversation without having to review redundant emails. Specialized software (usually offered by your processing firm) can identify and group together related emails by analyzing subject lines, sender and recipient information, dates, and message bodies.

Email threading technology can identify “inclusive” emails within each thread. These are the emails that contain the entire or most complete history of the conversation. By focusing on these inclusive emails, you can get the complete context without having to read every email in the thread. Email threading can also identify gaps where emails are missing from a thread. A gap does not prove anything by itself, but it may indicate that messages were deleted, missed during collection, or withheld, and it is worth raising with opposing counsel.
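As a rough illustration of threading and inclusive emails, the sketch below groups messages by a normalized subject line and treats an email as inclusive when no other email in the thread contains its full text. These are simplifying assumptions: real threading software also uses senders, recipients, dates, and message headers, and it handles quoted-text formatting that this example ignores. The sample emails are hypothetical.

```python
import re
from collections import defaultdict

# Matches one or more leading reply/forward prefixes such as "RE:" or "FW: RE:".
PREFIX = re.compile(r"^\s*((re|fw|fwd)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip reply/forward prefixes so related subjects compare equal."""
    return PREFIX.sub("", subject).strip().lower()


def build_threads(emails: list[dict]) -> dict[str, list[dict]]:
    """Group emails into threads keyed by normalized subject."""
    threads: dict[str, list[dict]] = defaultdict(list)
    for email in emails:
        threads[normalize_subject(email["subject"])].append(email)
    return dict(threads)


def inclusive_emails(thread: list[dict]) -> list[dict]:
    """Return emails whose body is not fully contained in another email."""
    return [
        email for email in thread
        if not any(other is not email and email["body"] in other["body"]
                   for other in thread)
    ]


emails = [
    {"id": 1, "subject": "Termination of J. Doe",
     "body": "Please confirm the termination date."},
    {"id": 2, "subject": "RE: Termination of J. Doe",
     "body": "Confirmed for Friday.\n\nPlease confirm the termination date."},
    {"id": 3, "subject": "FW: RE: Termination of J. Doe",
     "body": "FYI.\n\nConfirmed for Friday.\n\nPlease confirm the termination date."},
    {"id": 4, "subject": "Lunch", "body": "Noon?"},
]

threads = build_threads(emails)
inclusive_ids = [e["id"] for e in inclusive_emails(threads["termination of j. doe"])]
print(inclusive_ids)
```

Here the last forward carries the whole conversation, so reviewing that one message covers the other two.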

Keyword Searching

In processing your e-discovery, you should always use keyword-searching functionalities. With keyword searching, you can designate keywords to locate potentially relevant documents within a large volume of ESI. It culls down data to a manageable size. Craft a list of search terms such as names, specific dates, relevant locations, unique project names, industry-specific terminology, or legal phrases. The goal is to create a keyword set comprehensive enough to retrieve all pertinent documents while excluding the irrelevant.

Keyword search in e-discovery is nuanced. You must consider the way people communicate and account for variations in language. This could mean using wildcards (special characters that stand in for one or more characters within a search term, such as “calcul*” for all variations of “calculate”) and fuzzy searching (i.e., matching misspellings and alternative spellings of a term). Depending on the platform, stemming and synonym or concept features can also pick up different word endings and synonymous terms. Employ proximity searches to find where two or more keywords appear within a certain distance from each other (e.g., “accrued /5 unpaid”).
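The two search styles described above can be sketched with regular expressions. This is an illustration under stated assumptions, not any platform's actual query engine: the helper names are hypothetical, the wildcard is limited to `*`, and "within N words" is measured in either direction, which may differ from how a given platform interprets `/5`.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def wildcard_to_regex(term: str) -> re.Pattern:
    """Turn a term such as 'calcul*' into a whole-word, case-insensitive regex."""
    pattern = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)


def within(text: str, first: str, second: str, distance: int) -> bool:
    """True if the two words appear within `distance` words of each other."""
    words = tokenize(text)
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return any(abs(a - b) <= distance
               for a in first_positions for b in second_positions)


text = "We calculated the accrued but unpaid wages; the calculation was wrong."

wildcard_hits = wildcard_to_regex("calcul*").findall(text)
proximity_hit = within(text, "accrued", "unpaid", 5)
print(wildcard_hits, proximity_hit)
```

Note that a wildcard can over-match: “calcul*” would also return “calculus,” so test terms against a sample of the data before relying on them.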

Keywords are not perfect; therefore, the search is often supplemented with other e-discovery methodologies. These include technology-assisted review (TAR), in which reviewers code a sample set of documents for relevance and the software uses those decisions to rank or classify the remaining documents, and concept searches. Both help make sure no critical document is overlooked just because it doesn’t contain a specific keyword.

Keyword searching is a blend of legal acumen, linguistic awareness, and technical proficiency, all geared to pinpoint the best documents to tell the story of your case. Despite the rise of more sophisticated analytical tools, keyword searching remains a fundamental processing tactic.

Data Culling

Data culling also streamlines the review process. Culling refers to the process of systematically reducing the volume of ESI to a relevant subset. Primary culling techniques include (1) date range filtering, (2) file type selection (certain file types, such as temp files, can be excluded from the dataset outright), (3) domain and custodian filtering, and (4) concept clustering (advanced software can group documents by concepts to eliminate clusters that are off topic). Data culling helps reduce the number of documents you need to review.
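The first three culling filters are simple metadata tests, as the sketch below shows. The `Item` record, the excluded file types, and the sample data are all hypothetical, and concept clustering is left out because it depends on the platform's analytics.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePath


@dataclass
class Item:
    name: str
    sent: date
    sender: str


# Assumed list of file types to drop outright; adjust to the case.
EXCLUDED_TYPES = {".tmp", ".dll", ".exe"}


def cull(items: list[Item], start: date, end: date, domains: set[str]) -> list[Item]:
    """Keep items inside the date range, of a wanted type, from a wanted domain."""
    kept = []
    for item in items:
        in_range = start <= item.sent <= end
        wanted_type = PurePath(item.name).suffix.lower() not in EXCLUDED_TYPES
        wanted_domain = item.sender.rsplit("@", 1)[-1].lower() in domains
        if in_range and wanted_type and wanted_domain:
            kept.append(item)
    return kept


items = [
    Item("review.docx", date(2021, 5, 1), "hr@corp.example"),
    Item("cache.tmp", date(2021, 5, 2), "hr@corp.example"),
    Item("memo.pdf", date(2015, 1, 1), "hr@corp.example"),
    Item("newsletter.pdf", date(2021, 6, 1), "news@vendor.example"),
]

kept = cull(items, date(2020, 1, 1), date(2022, 12, 31), {"corp.example"})
print([item.name for item in kept])
```

Record the filters you apply, since you may need to explain or defend your culling decisions later.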

Organizing Your E-discovery: A Framework for Efficiency

Once your e-discovery has been processed, you must organize it. One of the primary strategies for data organization involves tagging and categorizing.


Tagging

Tags function as digital labels that enable legal teams to quickly sort, filter, and retrieve documents. Before your review process, develop a tagging protocol that defines each tag and its use. For example, you can tag items “relevant,” “hot,” and “non-responsive.” You can also code tags by different issues in the case. Arrange your tags hierarchically, from broad categories down to specific issues (also known as parent and child tags). Bulk tagging a specific custodian or date range is also helpful. Some e-discovery platforms even employ AI to suggest tags based on document content and previously tagged examples.
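One way to picture parent and child tags is as path-like labels, where a search on the parent also returns documents carrying any of its children. The sketch below is a hypothetical data model for illustration, not how any particular platform stores tags; the tag names and documents are invented.

```python
def has_tag(doc_tags: set[str], tag: str) -> bool:
    """True if the document carries the tag itself or any child of it."""
    return any(t == tag or t.startswith(tag + "/") for t in doc_tags)


def bulk_tag(docs: list[dict], custodian: str, tag: str) -> int:
    """Apply a tag to every document from one custodian; return the count."""
    count = 0
    for doc in docs:
        if doc["custodian"] == custodian:
            doc["tags"].add(tag)
            count += 1
    return count


docs = [
    {"name": "email-001", "custodian": "smith", "tags": {"responsive/hot"}},
    {"name": "email-002", "custodian": "smith", "tags": {"non-responsive"}},
    {"name": "review-2021", "custodian": "jones", "tags": {"responsive/performance"}},
]

# Filtering on the parent tag picks up both child tags.
responsive = [doc["name"] for doc in docs if has_tag(doc["tags"], "responsive")]

# Bulk tag everything from one custodian for a second-pass review.
tagged = bulk_tag(docs, "smith", "second-pass")
print(responsive, tagged)
```

Writing the protocol down in this form before review starts makes it easier to keep tags consistent across reviewers.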


Categorization

Categorization involves grouping documents into clusters based on shared characteristics (e.g., legal issues, custodians, sources). This facilitates issue-centric review. Categorization, like tagging, should be an iterative process, changing as the case unfolds and more information becomes available. Again, data processing platforms can automate categorization by analyzing document content and metadata and making suggestions to reduce manual review time. You can then use your data processing platform to integrate tagging and categorization into a larger suite of review tools for redactions and annotations.

Annotating E-discovery: Adding Value to the Raw Data

After you have tagged and categorized your e-discovery production, you will want to add notes and comments or otherwise highlight or mark documents to help formulate your case strategies. Most e-discovery platforms have built-in annotation features. Annotations become part of the metadata of the document within the e-discovery platform, which means they are searchable and can be included in reports or exports.

Your annotations can be color-coded or otherwise categorized to denote the type of note (e.g., issue versus theme). Materials can be annotated as being germane to a particular witness, which helps in crafting a deposition line of questioning. And, to the extent your client also has access to the e-discovery, you can collaborate through your annotations. Put simply, annotations allow for deeper insight, better preparation, and improved collaboration.

AI Tools to Further Help Your E-discovery Review

Once your e-discovery has been processed, organized, and annotated, you can use additional AI tools to cut even more time and cost. For example, Thomson Reuters CoCounsel Core offers a task-oriented set of AI skills based on legal workflows. If you have large volumes of text, it can generate concise and accurate summaries. Offerings from Lexis, Wolters Kluwer, and others provide various task-oriented AI skills as well. CoCounsel, as an example, can summarize lengthy depositions. It can also draft correspondence, outlines, pleadings, and even motions relevant to your e-discovery. You can even use it to create a timeline for understanding the sequence of events that transpired in a case. All this amounts to significant time and cost savings for you.

Presenting E-discovery: Making a Case in a Deposition, Evidentiary Proceeding, or Courtroom

Your e-discovery has now been processed, organized, and annotated. The question remains: How can you best present it all to your client, a witness, or the court? Utilize presentation software such as Thomson Reuters Case Center, TrialPad, or TrialDirector. Using Case Center, you can upload all germane proposed evidentiary materials for a proceeding and invite others into the matter to view the materials. In a live, virtual, or hybrid proceeding, you can direct anyone with access to the materials to the right page within the evidence with a touch of a button (and in real time). And with its presentation capabilities, you can “call out” key portions of evidence so that everyone can concentrate on them. You can even use a witness portal to give witnesses access to view and mark up evidentiary files. TrialPad, an iPad-based app, allows lawyers to manage and present evidence, including video clips and audio files, directly from their tablets.

Master the Process

Effective e-discovery processing reduces costs, saves time, and decreases the risk of error or oversight, but perhaps most importantly, it equips you with the best evidence to advocate for your client. As the amount of ESI continues to grow exponentially, mastering the e-discovery process will become not merely beneficial but essential for legal practitioners in all fields.