November 2011 | Personal Branding
Managing e-Discovery in Small to Medium Cases: Software Solutions
Each product is, for the most part, modestly priced given what it can do. In fact, in several cases the cost is low enough that it could justifiably be passed on to the client directly rather than absorbed as overhead. Each product's functionality may be somewhat limited compared to similar higher-end solutions, but they all do a good job for the specific task at hand and may suffice for your particular needs.
Before we turn to the software, certain assumptions need to be addressed. First, the assumption is that you will be working with copies of live data in native format. It is also assumed that the files you are dealing with are typical files created by common programs used for email, word processing and other office functions. The more unusual the file formats, the more likely it is that you need a higher-end solution.
Another assumption is that you want to host the data yourself, and that you have the equipment and skills to handle things yourself. There are many good Internet based hosted solutions that can fill your needs, and costs for them vary widely. However, the typical storage fees charged for a case that exists for any length of time can bust a modest technology budget. Therefore, we’re looking at applications that can be installed on one computer to be used as your analysis machine, and we’re assuming the ESI is produced in a way that it can be examined using that computer. Typical examples would be files burned to CD or DVD, or produced on an external storage device like a thumb drive or external hard drive.
We’re also assuming you are not dealing with terabytes of information. Small cases typically involve smaller volumes of ESI. While the featured products can handle large volumes of data, for illustrative purposes here we're assuming the size of the collections is relatively modest. Finally, we’re assuming you have a cooperative relationship with the other side, at least in terms of dealing with e-Discovery. The single most effective way to keep e-Discovery costs low is to work with your opposition in a cooperative manner so you can stipulate to the use of low cost solutions.
Harvesting Data With One Click Collect
Imagine that you’ve just been brought into a new piece of litigation, and you determine your client has potentially relevant electronically stored information for multiple custodians on a network server. The custodians also actively save files to the local drives of their individual PCs. In addition, your client tells you they have several stand-alone PCs that are used for testing and design purposes, where the test results and related documentation are stored locally by different users. They have several locations throughout the country where these stand-alone computers are used. Finally, they have a separate email server.
You need to quickly develop a plan so you can discuss the issues of preservation and collection at an upcoming meet and confer. The amount of money at stake is relatively modest, and you don’t want to break the bank with e-Discovery costs. In fact, you’d like to rely on internal client resources as much as possible. You are certain you can convince the other side that this is not a case where a computer forensics expert is needed, and you don’t want to incur the expense of multiple people descending on the various offices around the country performing full forensic imaging of every PC and server. At the same time, you know that a forensically defensible collection method should be used in harvesting the ESI, and you want the collection method used to preserve basic metadata, provide chain of custody documentation and ensure appropriate verification using hash values.
You talk to your favorite e-Discovery consultant and he tells you that Pinpoint Labs’ One Click Collect Harvester is a software product that will handle all your collection needs. In fact, with appropriate consultation with your client’s IT department, your consultant assures you he won’t even have to leave his office. Internal resources can be used under his direction to minimize the cost of collection. Your consultant also tells you that once the information is collected it will be easy to load it into your litigation support software because all of the files are copied in native format. This should lead to quick turnaround on your deliverables. This section explains how One Click Collect Harvester could meet the needs of the above client by offering an on-site collection option without the need for the consultant to actually be on-site.
One Click Collect Harvester (OCC) is available in two versions, Portable and Server. Both versions provide for automated file collections using predefined requests that can be launched automatically when collection storage media is connected to the host computer. As the names suggest, the difference between the versions is that Portable is intended to be deployed on flash drives or external hard drives to allow users to copy from an individual workstation or server, while Server can be deployed on a network-attached storage (NAS) device or any shared network location. Prior to copying, OCC can be emailed to the client site and used to analyze the available data to determine the size of the storage media needed for collection. Once the size of the storage media is determined, an appropriately sized external hard drive can be shipped to the client site with the software and job tickets already installed. When properly preconfigured, the unit only needs to be plugged in to the host computer, and with one click from the operator the copying process is launched.
During the copying process OCC preserves metadata and file time stamps, and creates a chain of custody log for the entire process. The log file is saved as a .csv file and can be imported into Excel for analysis and to create a written report. The files themselves are copied in native format so they are immediately ready for further processing in common litigation support programs. No conversion from a proprietary format is needed, and no additional hardware, key fob or similar item is needed for the program to work.
Thus, in the case above, the e-Discovery consultant can work with the client's IT staff to identify the computers that need to be collected, the users whose files need to be collected, and the path information that defines the location of the source information to be copied. That information is loaded into a job ticket that is installed on the collection drive. Additional options allow keyword culling of loose files, archives, and MS Outlook PST messages and attachments. The e-Discovery consultant does not need to be physically present to obtain a defensible collection. He can ship the drive to the designated IT person, who can plug in the drive and launch the program. Once the process is complete and the files are copied, the drive can be sent back to the consultant or the client for further processing.
The process is started by creating a job ticket. Using the Job Manager, which is a type of wizard, the person setting up the collection process fills in a number of fields of information. If additional data sources need to be identified at the point of collection a user friendly drag and drop window (Harvester ESI Vault) can be used. Network or local files and folders or MS Outlook and Exchange messages can be selected and placed in the ESI Vault on the fly. The Job Name is identified, and an Instructions field is available to contain specific collection instructions that will later show up on the job list used by the end user to launch the copy process. An Error Notes field is also available to provide contact information for help if the end user encounters problems running the specific job.
Next the Sources field is completed. This field contains the file path information for the data to be copied. The selection tool allows you to manually specify a path, a particular folder or a particular file, or you can import a file list, which is a text file that contains predefined instructions. The use of a file list is a time saver when the same type of information is sought from multiple computers, and it helps ensure consistency. Next you fill in the Job Target Path by selecting a folder to hold the files that are collected. A similar selection is made for a Log Path for the log report.
OCC next allows you to specify the type of files to copy. This can be done in one of several ways. You can use the selection tool to select file types by category, e.g., email files, office documents, accounting files, etc., which will copy all file types in that generic category; or you can drill down to the next level by specifying individual file types in the category, e.g., Outlook for email files, or Microsoft Office 2007 files for office documents. Alternatively, you can use an extension file list to determine what is copied by importing a text file containing specified file extensions. This allows you to collect, for example, just .doc, .docx, .xls and .xlsx files.
The copying parameters are set by completing the Action field, which allows you to either include or exclude files by the file extensions selected; and by setting a date range to be applied to either the date created, date modified or date last accessed. Finally, the Computer Name field needs to be specified. This completes the basic information for a job ticket. There are further advanced options that allow you to create full paths, copy empty folders, exclude temp files, exclude system files, hash the source and destination files, and make a variety of other selection options that may be of use depending on the situation. Other useful advanced features available only in the Harvester version are the ability to apply a deNISTing filter, and to run in silent mode so the individual whose PC is being copied will never be aware the process is occurring. Another advanced feature is the ability to create scripts that can be used to create job files, start a job and launch programs or utilities that can work with the data capture and the files collected.
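To make the include/exclude and date range settings described above concrete, here is a minimal Python sketch of that kind of selection logic. It is only an illustration, not OCC's implementation: the function name `select_files` and its parameters are hypothetical, and it filters on the modified date only.

```python
from datetime import datetime
from pathlib import Path


def select_files(root, extensions, action="include",
                 modified_from=None, modified_to=None):
    """Return files under root that pass the extension and date filters.

    Hypothetical helper for illustration; not part of any product.
    """
    wanted = {ext.lower() for ext in extensions}
    selected = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        matches = path.suffix.lower() in wanted
        # "include" keeps only matching files; "exclude" keeps only the rest.
        if (action == "include") != matches:
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if modified_from and modified < modified_from:
            continue
        if modified_to and modified > modified_to:
            continue
        selected.append(path)
    return sorted(selected)
```

For example, `select_files("D:/custodian1", [".doc", ".docx", ".xls", ".xlsx"], modified_from=datetime(2010, 1, 1))` would list only those office documents modified on or after that date; the folder path here is made up.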
OCC is then run and the files are automatically copied to the specified folder. A log file is automatically created and stored. This .csv file contains the chain of custody information. Later it can be imported into Excel to generate a report that specifies the date and time a file was copied, whether the hashes matched, the source path, the source created, modified and accessed dates, the source MD5 hash, the destination path, the destination created, modified and accessed dates, the file size and the destination MD5 hash.
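For readers who want to see what hash verification and a chain of custody log amount to, the following is a rough Python sketch of the general idea. It is not how OCC works internally. The function names and log column names are assumptions made for illustration, the log is trimmed to a few of the fields listed above, and files that share a name would overwrite each other in this simplified version.

```python
import csv
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect(sources, target_dir: Path, log_path: Path) -> None:
    """Copy each source file and write one hash-verified log row per file."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", newline="") as log_file:
        writer = csv.writer(log_file)
        # Column names are illustrative, not OCC's actual log layout.
        writer.writerow(["copied_at_utc", "source_path", "source_md5",
                         "destination_path", "destination_md5",
                         "size_bytes", "hashes_match"])
        for source in map(Path, sources):
            destination = target_dir / source.name
            source_md5 = md5_of(source)
            # copy2 copies the content and, where the platform allows,
            # the modified and accessed times as well.
            shutil.copy2(source, destination)
            destination_md5 = md5_of(destination)
            writer.writerow([datetime.now(timezone.utc).isoformat(),
                             str(source), source_md5,
                             str(destination), destination_md5,
                             destination.stat().st_size,
                             source_md5 == destination_md5])
```

A matching source and destination hash shows that the copy is bit-for-bit identical to the original, which is what makes the log useful as verification.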
Another advanced feature to be aware of is that you can easily activate and deactivate the license. Once a job has been run using the program installed on one collection drive, that installation is easily deactivated and the installation on another external drive for another machine can be activated. If you sent one drive to Los Angeles and another to New York, as soon as the collection in Los Angeles was done and the software was deactivated, the software on the New York drive could be activated and the collection run. You can even email the program and the job tickets. In the above example, the e-Discovery consultant could email Harvester Portable to the client's IT person. The IT person could load it on a thumb drive so it could be moved from machine to machine. The IT person could also procure the external hard drives himself, load the software on the drives along with the job ticket prepared by the consultant, plug in the drive, activate the software and start the collection process. OCC Harvester also includes support for TrueCrypt volumes. If the IT person chose to create a TrueCrypt partition or volume on the external media, Harvester could seamlessly copy files to the encrypted container, ensuring your files are safe during transport.
In the hypothetical we used a discovery consultant to set the parameters for collection, but it should be obvious OCC can be used without the need for a third party consultant. Particularly in large in-house settings with a knowledgeable IT staff and an attorney experienced in e-Discovery matters, the process could be done completely in-house, and the attorney could present a collection of files in an electronic production that is defensible. It goes without saying, however, that this product is intended to be used for collection of live data only. Therefore, before it is used the parties should agree that forensic imaging is not required and no special preservation steps are necessary. You should also understand that it cannot be used to recover deleted files.
In terms of training and support, the help files that come with the software are very complete. In addition, there are several training videos on the Pinpoint website. The folks at Pinpoint are also very responsive and willing to help you configure your job tickets or do whatever is necessary to successfully deploy their product. In about an hour you can easily learn how to use One Click Collect. In terms of cost, a license of Harvester Portable costs $1196.00 for the software and $299.00 for annual maintenance, for a total of $1495.00. A three seat Harvester Server license costs $4478.40 for the software and $1119.60 for annual maintenance, for a total of $5598.00. Unlike other applications that charge additional per user, gigabyte, or custodian fees, Harvester use is unlimited. Additionally, the Portable license does allow you to collect from network locations, which is usually only available on a Server or Enterprise license with other companies.
Managing Email e-Discovery With Adobe Acrobat X
For example purposes we'll consider Microsoft Outlook email, though the process is similar for Lotus Notes. The conversion does require some customized setup of Outlook so you can import the .pst and .msg files that are to be produced into the Outlook program installed on the processing machine. Once the files are properly loaded into a folder in Outlook, converting them to PDFs is simple. There are some limitations to keep in mind, however. Acrobat has a limit of 10,000 emails per conversion, so if you have a large volume of email to process you'll need to break it down into smaller subsets. The conversion takes time to run its course, so be patient, though obviously the more powerful your computer (faster hard drive, more RAM and faster processing speed), the quicker you'll be able to complete the conversion.
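The arithmetic of breaking a large mailbox into subsets that fit under the 10,000 email limit can be sketched in a few lines of Python. This is only a planning aid under stated assumptions: the `batches` helper and the message identifiers are hypothetical, and it does not interact with Outlook or Acrobat.

```python
from typing import Iterator, List, Sequence

ACROBAT_BATCH_LIMIT = 10_000  # per-conversion limit stated in the article


def batches(items: Sequence[str],
            size: int = ACROBAT_BATCH_LIMIT) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# 25,000 hypothetical message identifiers split into three conversion runs.
messages = [f"msg-{n}" for n in range(25_000)]
print([len(batch) for batch in batches(messages)])  # prints [10000, 10000, 5000]
```

In practice each subset would correspond to its own Outlook folder, converted to its own Portfolio.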
Once the conversion is complete and the Portfolio is created, it can be viewed in a flattened or foldered view. The default view is the flattened view showing all emails from all folders and recipients. In terms of filtering you can choose a field to filter, choose operators to apply (e.g., field contains, does not contain, starts with, ends with, etc.) and add search text. You can even have multiple levels of filters (e.g., all messages from John Smith where the subject line contained "Jill" or "Bob" or "Ice Cream"). Files can be marked for privilege and subsequently filtered to remove privileged items prior to production. The layout of the Portfolio is also customizable.
Email attachments are not converted into separate PDFs. Instead, they are embedded in the PDF email message to which they were originally attached in their original format. They can be reviewed using their native application. Once the initial Portfolio is created, it can be reviewed and nonresponsive or privileged items can be deleted and a subset Portfolio can be created to produce to the other side. It is also possible to convert the Portfolio to a PDF Binder which is a single PDF document with multiple pages including attachments.
The cost of Adobe Acrobat X from the Adobe website is $449 as a new purchase, and $199 as an upgrade of an existing license.
Managing Document Collections With dtSearch
dtSearch will process the files for you by first creating an index. This may initially be time consuming, but once all the files are indexed, the search speed as you search through your collection seems almost instantaneous. There are many different search tools and techniques that come standard with the program. They include stemming (e.g., search on apply returns hits of apply, applies, applied, applying); concept searching using synonym or thesaurus searching (e.g., search for incendiary returns incendiary and arsonist and inflammatory, and if related words are also chosen, combustible and bomb would also return as hits); Boolean, Phrase, Proximity, Wildcard, Fuzzy, Phonic, Field and Numeric Range searching. Even natural language searching is supported!
The initial indexing is simple to perform. Using the Create Index dialog box you name your index and select a location for it to be stored. You'll be asked if you want to add documents to the index. You then select from a short menu of available actions and add what you want indexed, whether folders, files, Outlook, or even the web. You can apply extension filters to exclude certain types of documents or files. You then select "start indexing" and let the program do its job. When the indexing is done you can move on to searching.
When it comes to searching, the user interface is simple to use. It consists of a search pane that contains a word list that you can select from, showing the number of incidences of each word. If you have multiple indexes set up you can search more than one by selecting the indexes you want searched. You can also specify whether you want to search by word or by Boolean search. You can build a query in the query box using the built in connectors. You can determine if you want to include stemming, fuzzy, phonic or synonym searches. Click on search and off you go. There are additional search options that can be selected that filter results based on a limitation of the number of hits, matching certain names, setting date ranges or setting file sizes. A search history is also created so you can return to your results at a later time.
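To see why searching an index feels nearly instantaneous, and where a word list with counts comes from, consider this toy inverted index in Python. It is a teaching sketch only and says nothing about how dtSearch is actually built; the function names are hypothetical, and it supports only simple AND and OR queries with no stemming, fuzzy or phonic matching.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to {document name: number of occurrences}."""
    index = defaultdict(dict)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word][name] = index[word].get(name, 0) + 1
    return index


def search_all(index, *words):
    """Boolean AND: names of documents containing every word."""
    hits = [set(index.get(word.lower(), {})) for word in words]
    return set.intersection(*hits) if hits else set()


def search_any(index, *words):
    """Boolean OR: names of documents containing at least one word."""
    return set().union(*(index.get(word.lower(), {}) for word in words))
```

The expensive step, reading every document, happens once in `build_index`. Each search afterward is just a dictionary lookup and a set operation, which is why the up-front indexing time pays off.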
Once the search is complete the search results window will open. The top half of the window lists the files retrieved in the search and the lower half will show the document selected in the list in a viewer with the search term highlighted. The file list contains the name of each document, the relevancy score, the number of hits, the file location, the date created and the title. The viewer format changes depending on the format of the document to be viewed. A button bar includes navigation tools to move between hits within the document, or from document to document forward or back. If desired, the selected document can be launched in its native application. If you find a document that is relevant, you can save it, copy it or print it. dtSearch Desktop with spider offers an incredibly sophisticated search tool at a bargain price. Learn more at www.dtsearch.com.
Processing and Managing ESI with Breeze e-Discovery Suite
The EDD processing module allows you to process electronic documents and convert them to TIF with a load file created for popular litigation support programs like Summation, Concordance, TrialDirector and others. You can also create a load file for the electronic documents leaving them in native format to be worked with as e-documents in your litigation support program.
The most recent offering from Breeze is the new BreezeDocs case document management application. More on this below. When called upon to produce electronic information for your less advanced opposition, i.e., they want it in paper, the printing module provides you with the necessary tools to make a high volume blow back production. Finally, the load file tools module contains specific tools that allow you to work with and manipulate Summation DII files. The cost for the Breeze e-Discovery Suite is $3,495 per concurrent user, which includes the first year maintenance fee. Thereafter, ongoing maintenance is $699; however, it's not required, just encouraged.
It's not uncommon to receive a CD or DVD of imaged documents from the other side as TIF images or PDFs. For some people this constitutes an electronic production. While you should always consider whether you want to specify native format for productions, sometimes the electronic paper equivalent is OK. This is particularly true if you use some type of litigation support product like Summation or Concordance. Using the image processing module you can easily select the location of the electronic files to be processed by browsing to the proper folder. You can optionally select processing images from subdirectories. You then move on to numbering to establish the desired scheme for printing DOCIDS on the documents. You can select a prefix to use, and then the available page counter will automatically increment by page. You can predefine the number of digits to use, and use as many leading zeros as you wish. You can also start at a defined number greater than one, which is helpful when adding additional documents to an existing collection.
When processing multi-page TIFs you have the option of numbering each page incrementally or you can number by document with a separate suffix incrementing on a page basis. You also have custom footer fields available where you can add legends like "Confidential" or "Attorney Eyes Only" and select where the stamp is placed on the documents. In terms of output you select an output folder by browsing. You then check off whether you want to OCR the documents, and whether you want to create any particular load files. For DII files you can decide whether to include EndDoc and you can indicate if you are using the Enterprise version. Finally, you start the processing and when you are done, you have a collection of bates numbered documents in the output folder. The specified load files will appear there as well. In addition a .csv file is created which can be loaded into Excel to serve as an index.
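The two numbering schemes described above, page by page or by document with a page suffix, can be illustrated with a short Python sketch. The function names, the six-digit default and the "." suffix separator are assumptions chosen for the example, not Breeze's actual settings or output format.

```python
def page_numbers(prefix, page_count, start=1, digits=6):
    """Number every page incrementally: ABC000001, ABC000002, ..."""
    return [f"{prefix}{n:0{digits}d}"
            for n in range(start, start + page_count)]


def document_numbers(prefix, pages_per_document, start=1,
                     digits=6, suffix_digits=3):
    """Number by document, with a per-page suffix: ABC000001.001, ..."""
    labels = []
    for offset, pages in enumerate(pages_per_document):
        doc_id = f"{prefix}{start + offset:0{digits}d}"
        labels.extend(f"{doc_id}.{page:0{suffix_digits}d}"
                      for page in range(1, pages + 1))
    return labels


# Continue an existing collection at page 101.
print(page_numbers("ABC", 3, start=101))
# prints ['ABC000101', 'ABC000102', 'ABC000103']

# Two documents, of two pages and one page respectively.
print(document_numbers("ABC", [2, 1]))
# prints ['ABC000001.001', 'ABC000001.002', 'ABC000002.001']
```

Starting at a number greater than one, as in the first call, is what lets you append new documents to an existing numbered collection without collisions.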
In terms of EDD processing, you can use this tool to take a collection of electronic files in multiple formats and process them for further use. If you use a litigation support tool you will probably just want to process them as electronic documents and create a load file so they can easily be loaded into the document database. However, you have the option to create TIFs from the electronic files, including TIFing any email attachments. As a result, if you choose, you could number the documents and produce them as TIFs or even blow them back to paper. You can also OCR the related TIFs. The TIFs and OCR can be loaded into a litigation support program as well. There are additional higher level features. You can use custom lists to specify which documents to process. Finally, you can create a deduped master, and even run subsequent productions against the master for additional deduplication.
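The idea of a deduped master that later productions are run against can be sketched as follows. This assumes duplicates are identified by an exact-content MD5 hash, which is a common approach but is an assumption here; the article does not say how Breeze identifies duplicates, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hash(path):
    """Return the MD5 hex digest of a file's full contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def dedupe(paths, master=None):
    """Return (unique_paths, master).

    `master` is the set of hashes already seen. Passing the returned set
    into a later call dedupes a new production against earlier ones.
    """
    master = set() if master is None else master
    unique = []
    for path in paths:
        digest = file_hash(path)
        if digest not in master:
            master.add(digest)
            unique.append(path)
    return unique, master
```

A first call such as `unique, master = dedupe(first_production)` builds the master, and `dedupe(second_production, master)` then returns only files not already seen in either set.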
The printing module allows you to print from existing files or from load files. You can select whether to insert slip sheets between documents, print as color or as black and white, and select the print range. The Image Manager in the document tools module allows you to split existing documents by inserting page breaks or combine pages by deleting a page break. This can be very useful in converting a large volume of single page TIFs into the discrete multi-page documents within the collection. The collection can be reprocessed after modification and new load files can be created.
BreezeDocs will dramatically change the way you look at automating case documents. Before BreezeDocs, your only choices for document management were Summation, Concordance, Case Logistix and a few other tools. BreezeDocs changes the playing field. The product comes in two flavors: 1) a review and coding tool at only $99 per seat, and 2) a full import, export and bates numbering module at $499. Together they make digitizing all of your case documents truly a Breeze. BreezeDocs came from spinning the Image Manager out of the core Breeze product into a separate standalone application. Breeze added the native review feature so that you can now process native files with Breeze and do a review with BreezeDocs without ever having to TIF the data. You then convert to TIF only the files you need, not the entire set.
Bruce A. Olson is President of ONLAW Trial Technologies, LLC, www.onlawtec.com, a trial technology, e-Discovery and computer forensics consulting company.