November 21, 2012

Making Sense of Data Tsunamis

Robert Grande – November 21, 2012

U.S. Magistrate Judge Paul Grewal has ordered Apple to provide Samsung access to sketchbooks and computer-aided-design (CAD) files relating to the development of its iPad and iPod products. The two companies are locked in an intellectual-property dispute over the design of tablet computers and smart phones. This is but another chapter in the protracted suit between the technology giants, with Apple being granted an injunction by the German courts in July that bars importation of Samsung’s tablets and phones. This back-and-forth is blazing trails in the realm of e-discovery, as well as in design patents. Access to the sketchbooks and CAD files must take place at a secure third-party location, adding another layer of complexity to the already complex e-discovery procedure between sophisticated parties. The first-to-market design dominance that Apple has been able to wield could impact the representation of emerging OEMs (Original Equipment Manufacturers).

E-discovery and document review are definitely not one-size-fits-all approaches and should be centered on people, process, and technology. All things being equal, when choosing your review technology you should start with the end result in mind; this will help determine the process and the individuals needed to meet the desired goals. Your choice of technology should also be discussed with the opposing party, and both parties should agree on the chosen method.


Additional items to consider are the amount and location of the data relevant to the case and the time frame for meeting the desired results. All of these items help determine your choice of technology.


In response to Samsung's RFP no. 1, Apple shall produce all “sketchbooks,” or relevant sections thereof, relating to the four patents at issue in Apple’s preliminary injunction motion. Apple does not seriously dispute that the sketchbooks address designs at issue in this case and its burden arguments are not persuasive. While Apple has every right to review and withhold from production those sketches not at issue in the preliminary injunction motion, it offers nothing beyond attorney argument that the volume of materials to be reviewed is particularly onerous. The production shall be completed no later than September 30, 2011.


Samsung is tasked with reviewing a large volume of documents in a limited amount of time. However, there are many document-review software options that can handle large amounts of data, as well as document-review providers that supply licensed attorneys for relevance review. The main question here is which process to use in culling the data for review.


Defining Document Review and Defensibility
Let’s first define document review and defensibility. Document review is a task performed by attorneys in anticipation of legal proceedings or during the discovery phase of litigation. It requires attorneys to assess the relevance and/or responsiveness of documents, using knowledge about the facts of the case and the issues of law. Attorneys must also consider whether a document is privileged (on the basis of attorney-client communication and/or work product), in which case it may be either withheld from production or redacted based on production requirements and relevance. (Later stages of document review are sometimes called privilege review or second-level review.)


Legal defensibility is defined as follows: An organization must proactively build a case that can withstand legal scrutiny, which demonstrates that it has done everything reasonable to protect itself and its assets in order to preserve and build long-term value. It should operate under the assumptions that it will experience a security incident, and that as a result of such an incident, it will be subject to legal proceedings (civil or criminal) that challenge whether or not it did what was necessary and reasonable in protecting itself.


Defensible and Cost-Effective Techniques
Data Culling

  • File Type: identifying system files and unique file types allows you to remove files that may not be relevant. This should be applied prior to processing.
  • Date Range: limit the collection to documents dated within the period relevant to the case
  • Global Deduplication: remove duplicate documents across all custodians
  • Email Domain Name: identifying all email domains in the collection allows you to remove nonrelevant senders prior to processing
  • Search-Term List Creation: using search technologies ranging from Boolean and wildcard to concept searching, together with validation and random sampling of the search terms, can reduce the population prior to production
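As a rough illustration of how several of these culling criteria can be chained, here is a minimal Python sketch. Everything in it is an assumption made for the example: the document fields, the excluded extensions and domain, and the date window are hypothetical, and a real review platform would apply these filters through its own processing settings.

```python
import hashlib
from datetime import date

# Hypothetical collection: field names and values are illustrative only.
docs = [
    {"custodian": "A", "ext": "docx", "sent": date(2011, 3, 1),
     "sender": "designer@example.com", "body": "tablet sketch v1"},
    {"custodian": "B", "ext": "docx", "sent": date(2011, 3, 1),
     "sender": "designer@example.com", "body": "tablet sketch v1"},
    {"custodian": "A", "ext": "dll", "sent": date(2011, 4, 2),
     "sender": "it@example.com", "body": "binary"},
    {"custodian": "A", "ext": "msg", "sent": date(2009, 1, 5),
     "sender": "designer@example.com", "body": "old memo"},
    {"custodian": "B", "ext": "msg", "sent": date(2011, 5, 9),
     "sender": "news@newsletter.example", "body": "weekly digest"},
]

SYSTEM_EXTS = {"dll", "exe", "sys"}            # assumed system file types
DATE_FROM, DATE_TO = date(2010, 1, 1), date(2011, 12, 31)  # assumed window
EXCLUDED_DOMAINS = {"newsletter.example"}      # assumed nonrelevant domain


def cull(documents):
    """Apply file-type, date-range, domain, and global-dedup filters."""
    seen_hashes = set()
    kept = []
    for d in documents:
        if d["ext"].lower() in SYSTEM_EXTS:
            continue  # file-type cull
        if not (DATE_FROM <= d["sent"] <= DATE_TO):
            continue  # date-range cull
        domain = d["sender"].rsplit("@", 1)[-1].lower()
        if domain in EXCLUDED_DOMAINS:
            continue  # email-domain cull
        digest = hashlib.sha256(d["body"].encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # global deduplication across all custodians
        seen_hashes.add(digest)
        kept.append(d)
    return kept


kept = cull(docs)
print(len(docs), "collected,", len(kept), "left for review")
```

Keeping each filter as a separate, logged step is one way to support defensibility, since every exclusion can later be explained to the other side.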

    Document Review

  • File Type
  • Custodian Communication: identify custodians who frequently communicate with key parties and prioritize them for review
  • Clustering and/or Predictive coding
  • Clustering: organizes documents into subsets of recurring themes; these subsets can be evaluated and culled
  • Predictive Coding: rank documents by similarity
  • Review Batch Organization: logically organize document sets that contain highly similar documents
  • Prioritized Relevance: utilizing document text to identify potentially relevant documents and stage them for first-pass review
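To make the custodian-communication idea concrete, the following Python sketch counts how often each person appears on a message with a key party and orders them accordingly. The addresses, the message list, and the function name are hypothetical; it is only meant to show the shape of the analysis, not how any particular review tool implements it.

```python
from collections import Counter

# Hypothetical key party and message metadata (sender, recipients).
KEY_PARTIES = {"lead.designer@example.com"}
emails = [
    ("alice@example.com", ["lead.designer@example.com"]),
    ("bob@example.com", ["carol@example.com"]),
    ("alice@example.com", ["lead.designer@example.com", "bob@example.com"]),
    ("lead.designer@example.com", ["carol@example.com"]),
]


def rank_custodians(messages, key_parties):
    """Order people by how many messages they share with a key party."""
    counts = Counter()
    for sender, recipients in messages:
        participants = {sender, *recipients}
        if participants & key_parties:
            for person in participants - key_parties:
                counts[person] += 1
    return [person for person, _ in counts.most_common()]


priority = rank_custodians(emails, KEY_PARTIES)
print(priority)
```

Custodians at the top of the list would have their documents batched first, so reviewers see the most promising material early.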

    Predictive coding

    This is a method of review whereby a computer program can categorize entire collections of documents as responsive or nonresponsive without further human intervention. Typically, the program ranks documents from most to least likely to be responsive based on the parameters articulated at the outset. These rankings can then be used to determine which documents warrant further attention by human reviewers to QC the decisions made by the computer. Just as no spam filter perfectly categorizes all emails as junk or legitimate, predictive coding is not yet able to perfectly identify all relevant documents.
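The ranking step can be pictured with a small Python sketch. Note the assumptions: it uses a naive Bayes-style word score trained on a handful of attorney-coded seed documents, which is a simple stand-in and not the algorithm of any commercial predictive coding product, and all of the sample text is invented.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


# Hypothetical seed set coded by attorneys: (text, is_responsive).
seed = [
    ("tablet design sketch rounded corners", True),
    ("cad file for tablet bezel design", True),
    ("lunch menu for friday", False),
    ("parking garage schedule update", False),
]
unreviewed = [
    "friday parking update",
    "new sketch of tablet corners",
    "quarterly schedule",
]


def train(labeled):
    """Count word occurrences in responsive and nonresponsive seeds."""
    pos, neg = Counter(), Counter()
    for text, responsive in labeled:
        (pos if responsive else neg).update(tokens(text))
    return pos, neg, set(pos) | set(neg)


def score(text, model):
    """Sum of per-word log-likelihood ratios with add-one smoothing.

    Words never seen in the seed set are ignored, and class priors are
    left out because only the relative ordering matters here.
    """
    pos, neg, vocab = model
    pos_total = sum(pos.values()) + len(vocab)
    neg_total = sum(neg.values()) + len(vocab)
    total = 0.0
    for word in tokens(text):
        if word in vocab:
            total += math.log((pos[word] + 1) / pos_total)
            total -= math.log((neg[word] + 1) / neg_total)
    return total


model = train(seed)
ranked = sorted(unreviewed, key=lambda t: score(t, model), reverse=True)
for text in ranked:
    print(round(score(text, model), 2), text)
```

Human reviewers would then sample from the top and bottom of the ranking to check the program's decisions, which mirrors the QC step described above.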


    Concept searching

    An automated information retrieval method that searches data for the ideas expressed rather than the proximity and appearance of search terms. Using semantic and statistical algorithms, related data can be grouped. With concept searching and the right inputs, your search for “fruit” would also yield data about strawberries, guava, jams, and pies.
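The “fruit” example can be mimicked in a few lines of Python. This sketch is deliberately simplified: it relies on a hand-written concept map, whereas real concept-search engines derive such relationships from semantic and statistical analysis of the data. The map, the documents, and the function name are all hypothetical.

```python
# Hypothetical concept map; real tools learn these groupings from the data.
CONCEPTS = {
    "fruit": {"fruit", "strawberries", "guava", "jams", "pies"},
}

documents = {
    "doc1": "Grandma's strawberries went into the jams",
    "doc2": "Quarterly sales figures for tablets",
    "doc3": "Fresh guava delivered on Monday",
}


def concept_search(query, docs, concepts):
    """Return ids of documents containing any term tied to the query concept."""
    terms = concepts.get(query.lower(), {query.lower()})
    hits = []
    for doc_id, text in docs.items():
        words = {w.strip(".,;:!?\"'").lower() for w in text.split()}
        if words & terms:
            hits.append(doc_id)
    return hits


hits = concept_search("fruit", documents, CONCEPTS)
print(hits)
```

Neither matching document contains the word “fruit” itself, which is the point of searching by idea and not by literal term.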


    Clustering

    Any number of software applications can organize your documents by grouping them into clusters based on the similarity of the text they contain. Then, by looking at sample documents, you can often make accurate judgment calls on behalf of the entire group.
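As a minimal sketch of grouping by textual similarity, the Python below assigns each document to the first cluster whose founding document shares enough words with it. The greedy approach, the Jaccard word-overlap measure, the 0.3 threshold, and the sample texts are all assumptions chosen for brevity; production tools use far more sophisticated methods.

```python
def word_set(text):
    return set(text.lower().split())


def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(texts, threshold=0.3):
    """Greedy clustering: join the first cluster whose leader is similar enough."""
    clusters = []  # each entry: (leader word set, list of member indices)
    for index, text in enumerate(texts):
        words = word_set(text)
        for leader, members in clusters:
            if jaccard(words, leader) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((words, [index]))
    return [members for _, members in clusters]


# Hypothetical documents.
texts = [
    "tablet bezel design sketch",
    "team lunch order for friday",
    "revised tablet bezel sketch",
    "lunch order for monday",
]
groups = cluster(texts)
print(groups)
```

A reviewer could open one document from each group, decide whether the theme is relevant, and apply that call to the rest of the group.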

    Deduplication and Near Deduplication

    Functionally equivalent data that also appears elsewhere can be culled from your dataset prior to review so that you do not waste time reviewing multiple copies of the same data.
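Exact duplicates are usually caught by comparing file hashes. Near duplicates need a similarity measure, and the Python sketch below shows one common idea: compare overlapping three-word sequences (shingles) and drop any document that overlaps heavily with one already kept. The shingle size, the 0.5 threshold, and the sample emails are assumptions for illustration.

```python
def shingles(text, k=3):
    """Set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a, b):
    """Jaccard overlap between the shingle sets of two texts."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def drop_near_duplicates(texts, threshold=0.5):
    """Keep a text only if it is not too similar to one already kept."""
    kept = []
    for text in texts:
        if all(similarity(text, earlier) < threshold for earlier in kept):
            kept.append(text)
    return kept


# Hypothetical emails: the first two differ only in the last word.
texts = [
    "Please review the attached tablet design before the meeting on Friday",
    "Please review the attached tablet design before the meeting on Monday",
    "The cafeteria will be closed next week",
]
unique = drop_near_duplicates(texts)
print(len(texts), "documents,", len(unique), "after near-deduplication")
```

In practice the threshold is a judgment call: set it too low and distinct documents are suppressed, too high and reviewers still read many copies of the same message.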

    Keywords: litigation, technology, e-discovery, document review, design patents, Apple, Samsung

