In the field of e-discovery, where tackling mounds of data is the greatest challenge, technology is indispensable. Specialized applications enable legal teams to search and review volumes of data that would be beyond the practical capacity of purely human review. These applications speed the process of finding and producing relevant documents and weeding out those that are privileged.
Generally speaking, the e-discovery applications that power this process are of two kinds. There are applications hosted in the “cloud,” also known as software-as-a-service or SaaS. And there are locally hosted applications, sometimes referred to as enterprise software or appliance-based software.
For lawyers involved in major e-discovery matters, choosing the right platform is critical. Whether within a firm or a corporation, the lawyer must weigh the pluses and minuses of each type of platform and select the type most suited to the matters at hand.
Many articles could be—and have been—written about the benefits of the cloud over a local installation, and vice versa. Advantages of cloud applications include on-demand availability and scalability to meet the power and capacity demands of any size of project. Advantages of enterprise software include direct control and access.
However, one point of comparison on which little has been written is cost. Evaluating the cost of the cloud versus local is difficult because it’s like comparing apples to oranges. The costs that go into one are not identical to those that go into the other. This creates confusion in the market, with one approach or the other being touted as lower cost, without taking into consideration the full panoply of related expenses.
TOTAL COST OF OWNERSHIP
The most accurate comparison of these competing platforms is one that looks at the total cost of ownership—i.e., one that factors in all the direct and indirect expenditures required to use and support the application. These include software and hardware, infrastructure, and operating and personnel costs, among others.
With help from several e-discovery experts and analysts, we examined the total cost of ownership (TCO) of cloud-based versus enterprise-based e-discovery applications to see how they compared. The result was dramatic. Using our most conservative figures, the cloud produced cost savings of 36 percent—$2.3 million—over the enterprise software.
CONSTRUCTING OUR COMPARISON
To compare the costs, we constructed a study of a hypothetical, but typical, e-discovery client—a large law firm with a mix of large and small cases—and analyzed the total costs over a three-year span, using either a cloud or an in-house e-discovery platform.
Our hypothetical client was a law firm managing 200 small cases of 25 GB each and 25 large cases of 200 GB each over a span of three years. That adds up to 10 TB of data, but because it is rare for all the data to arrive at once in a case, we spread it out over the three years, for 3,333 GB per year. (For technological abbreviations or terms unknown to you, consult the glossary at the end of this article.)
We also assumed that the total data would be culled at a rate of 67 percent—the average rate reported by a recent industry survey—bringing the annual quantity of data to 1,100 GB after culling. We further assumed a maximum of 500 users on the system, a combination of attorneys, reviewers, project managers and others.
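The data-volume arithmetic behind these assumptions can be retraced in a few lines. This is only a sketch of the calculation as described above; the variable names are ours and are not part of the original model.

```python
# Figures taken from the hypothetical client described above.
small_cases, small_case_gb = 200, 25
large_cases, large_case_gb = 25, 200
years = 3
cull_rate = 0.67  # share of incoming data removed by culling

# Total data across all cases, then spread evenly over the three years.
total_gb = small_cases * small_case_gb + large_cases * large_case_gb
gb_per_year = total_gb / years

# Data left to host each year once culling has removed 67 percent.
hosted_gb_per_year = gb_per_year * (1 - cull_rate)

print(f"Total data: {total_gb:,} GB")
print(f"Per year before culling: {gb_per_year:,.0f} GB")
print(f"Per year after culling: {hosted_gb_per_year:,.0f} GB")
```

The small and large cases each contribute 5,000 GB, which is how the total reaches 10 TB.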
To establish the expenses to build the TCO model, we did the following:
- Selected popular in-house processing and hosting platforms that are widely available in the market today to compare against a typical cloud-based platform
- Obtained actual quotations from hardware and software suppliers
- Calculated annual hardware and software maintenance fees at 20 percent of the up-front capital expenditures
- Accounted for technology refresh by giving hardware a three-year useful life, which is the typical schedule in the IT industry
- Excluded full redundancy for the in-house platform
- Excluded business impact due to downtime because it varies too greatly from company to company
This last item is significant. Server downtime is a real risk businesses face. If downtime costs are included, then the cost-effectiveness of an on-demand, cloud-based service is even more dramatic.
To perform our cost comparison, we examined the following expenses:
- Up-front costs, which include the initial procurement costs, such as hardware equipment and software licenses
- Other one-time fees, which consist primarily of processing by the cloud provider to ingest and cull ESI
- Recurring fees, which include monthly hosting charges, annual software subscription fees and annual hardware maintenance fees for upgrades and technical support
- Ongoing operational expenses, which include the costs of data center co-location, point-to-point connectivity, office real estate and staff to support an in-house appliance
Let’s look at these one by one.
Up-front costs. One acknowledged advantage of the cloud is the absence of start-up costs. Because the cloud provider hosts and maintains the application on its own servers, users require no up-front investment for hardware and installation. The following table illustrates this:
SYSTEM INSTALLATION EXPENSES
|Expense||Cloud||In-House|
|Cost of Servers||$0||$132,000|
|Cost of Storage||$0||$200,000|
|Cost of Backup Library||$0||$40,000|
For our analysis, servers and storage were configured to meet the specification requirements of the selected processing and hosting platforms. Servers were configured to fulfill Web/application, processing, search, analytics and database roles.
Other one-time costs. These include site setup, processing and productions. The following table shows our estimates of typical fees:
OTHER ONE-TIME COSTS
|Expense||Cloud||In-House|
|Site Setup Fee||$112,500||$0|
TOTAL - Year 1
TOTAL - Year 2
TOTAL - Year 3
For the cloud platform, the site setup fee includes site consultation, instructor-led Web training and setting up standard fields, review forms, dynamic folders and user accounts.
The processing fee includes ingestion (i.e., the extraction of metadata, text and native files) and culling (i.e., filtering the data via deNISTing, de-duplication, file type filtering and date filtering).
There would be no processing fees for the in-house platform because the equipment costs and software licensing are accounted for in other expense categories.
Recurring fees. Although both cloud and in-house applications involve recurring fees, the fees differ widely in nature, as the following table shows.
|Expense||Cloud||In-House|
|Annual Recurring Service Fees - Year 1||$330,000||$0|
|Annual Recurring Service Fees - Year 2||$660,000||$0|
|Annual Recurring Service Fees - Year 3||$990,000||$0|
|Annual Hardware & Software Maintenance Fees||$0||$74,400|
|Annual Software Subscription Fees (processing platform for ingestion, culling and productions; hosting platform)||$0||$235,000|
|Annual User Licensing Fees - Year 1||$0||$79,200|
|Annual User Licensing Fees - Year 2||$0||$278,400|
|Annual User Licensing Fees - Year 3||$0||$480,000|
TOTAL - Year 1
TOTAL - Year 2
TOTAL - Year 3
The in-house appliance would incur annual recurring fees relating to hardware maintenance and software subscriptions associated with the processing and hosting platforms. Again, for the purpose of this case study, we have selected popular processing and hosting platforms that are widely available in the market today. We have selected platforms that are capable of processing and hosting 10 TB of data over a three-year period. To keep it simple, we have calculated the hardware maintenance fees, entitling the buyer to upgrades and technical support, at 20 percent of the up-front capital expenditures.
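As a check on the 20 percent rule, the maintenance figure can be recomputed from the up-front costs listed in the system installation table. The sketch below uses the article's figures; the variable names are illustrative only.

```python
# In-house up-front costs from the system installation table (USD).
upfront_in_house = {
    "servers": 132_000,
    "storage": 200_000,
    "backup_library": 40_000,
}
MAINTENANCE_PERCENT = 20  # annual maintenance as a percent of up-front capital

# Integer arithmetic keeps the dollar amounts exact.
capex = sum(upfront_in_house.values())
annual_maintenance = capex * MAINTENANCE_PERCENT // 100

print(f"Up-front capital expenditure: ${capex:,}")
print(f"Annual maintenance at {MAINTENANCE_PERCENT} percent: ${annual_maintenance:,}")
```

The result matches the $74,400 shown in the recurring fees table for annual hardware and software maintenance.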
For the cloud-based application, there are no maintenance or licensing fees. There would be a recurring monthly hosting fee, charged by the gigabyte. Assuming that the cull rate is 67 percent, then the data being hosted is 1,100 GB the first year, 2,200 GB the second year and 3,300 GB the third year.
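The article does not state the per-gigabyte hosting rate, but one can be backed out of the recurring service fees in the table. The following sketch does so; the resulting rate is our inference from the published figures, not a quoted price.

```python
# Culled data added to the hosted site each year (GB), per the article.
hosted_gb_added_per_year = 1_100

# Annual recurring service fees for the cloud platform, from the table (USD).
fees_by_year = [330_000, 660_000, 990_000]

# Hosted data accumulates: 1,100 GB, then 2,200 GB, then 3,300 GB.
hosted_gb_by_year = [hosted_gb_added_per_year * year for year in (1, 2, 3)]

# Inferred rate per hosted GB per year (an assumption derived from the table).
implied_annual_rate = [fee / gb for fee, gb in zip(fees_by_year, hosted_gb_by_year)]
implied_monthly_rate = [rate / 12 for rate in implied_annual_rate]

for year, (gb, rate) in enumerate(zip(hosted_gb_by_year, implied_annual_rate), start=1):
    print(f"Year {year}: {gb:,} GB hosted, implied ${rate:,.0f} per GB per year")
```

The same rate falls out in all three years, which suggests the fees were modeled as a flat charge per hosted gigabyte applied to the cumulative volume.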
Ongoing operating expenses. Just as the cloud platform requires no up-front costs, it also requires no ongoing operational expenses. The same cannot be said for the locally installed platform, as this table illustrates:
ONGOING OPERATING EXPENSES
|Expense||Cloud||In-House|
|Data Center Co-Location||$0||$60,840|
|E-Discovery Staff - Year 1||$0||$660,000|
|E-Discovery Staff - Year 2||$0||$901,000|
|E-Discovery Staff - Year 3||$0||$1,262,500|
TOTAL - Year 1
TOTAL - Year 2
TOTAL - Year 3
Ongoing operational expenses required to support the in-house platform include:
- Data center co-location, rather than an on-premises server room, to house hardware equipment and provide redundancies in power, cooling and 24/7 manned security.
- Point-to-point connectivity between the data center co-location and office. Due to very high traffic volumes with processing and hosting ESI, we have factored in a dedicated GigE link offering speeds up to 1,000 Mbps.
- Real estate cost for staff office space. We have estimated the real estate space to be 2,000 square feet at $30 per square foot annually, to accommodate a staff of seven.
- IT staff, which includes one network administrator, one help desk analyst and one database administrator to manage and maintain the infrastructure. We have also included one programmer to assist with customization projects.
- E-discovery staff, which includes one e-discovery manager and three e-discovery analysts to support the in-house appliance. We have budgeted for three project managers in the first year, five in the second year and eight in the third year.
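Two of these items lend themselves to a quick calculation. The sketch below works out the annual real estate cost from the stated figures and then backs out what the E-Discovery Staff rows in the table imply for each added project manager. The per-person amounts are our inference from the table, not salaries stated in the article.

```python
# Real estate figures stated in the article.
office_sq_ft = 2_000
cost_per_sq_ft_per_year = 30
annual_real_estate = office_sq_ft * cost_per_sq_ft_per_year

# Project manager headcount and E-Discovery Staff cost by year (USD).
project_managers = [3, 5, 8]
ediscovery_staff_cost = [660_000, 901_000, 1_262_500]

# Inferred cost of each added project manager: the year-over-year increase
# in staff cost divided by the increase in project manager headcount.
implied_pm_cost = [
    (ediscovery_staff_cost[i + 1] - ediscovery_staff_cost[i])
    / (project_managers[i + 1] - project_managers[i])
    for i in range(len(project_managers) - 1)
]

# Inferred combined cost of the manager and three analysts, who are
# present in all three years.
implied_operations_cost = ediscovery_staff_cost[0] - project_managers[0] * implied_pm_cost[0]

print(f"Annual real estate: ${annual_real_estate:,}")
print(f"Implied cost per project manager: ${implied_pm_cost[0]:,.0f}")
print(f"Implied manager plus analysts: ${implied_operations_cost:,.0f}")
```

Both year-over-year steps yield the same per-person figure, so the growth in this line appears to come entirely from the added project managers.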
This final point regarding staff is crucial. For enterprise applications, one of the hidden costs that buyers often fail to factor in is that of hiring experienced personnel to run and support the system and process. IT staff can maintain the infrastructure, but they are not capable of operating the software, sufficiently supporting users, managing expectations, and facilitating communications between internal clients, end users and operations.
For this reason, our TCO analysis accounts not only for IT staff but also for experienced e-discovery staff. Specifically, our analysis accounts for two types of e-discovery personnel:
Project managers. Project managers facilitate communications between the internal clients and e-discovery operations staff. They are the single point of contact responsible for overseeing the data moving through the workflow model. They provide first-line support, and serve as trainers and experts knowledgeable about the adopted in-house platforms.
E-discovery operations staff. These consist of a manager and analysts who are responsible for operating the software and exporting/importing data. In other words, these are the “doers” who get the job done. They also provide second-line support as a last step before having to contact the software development vendor.
To set the salaries for e-discovery staff, we used the average salaries identified by the Cowen Group in its 2011 salary survey of law firm litigation support staff. For IT salaries, we used data from CBSalary.com.
BOTTOM LINE: THE CLOUD SAVES 36 PERCENT
When the total costs over the three years are added up, the final cost for the cloud platform is $4 million versus $6.3 million for the in-house platform. That represents a savings of 36 percent with the cloud platform. The table below summarizes the numbers:
|Expense||Cloud||In-House|
|Other One-Time Costs - Year 1||$682,700||$0|
|Ongoing Operating Expenses - Year 1||$0||$1,112,540|
|Other One-Time Costs - Year 2||$682,700||$0|
|Ongoing Operating Expenses - Year 2||$0||$1,353,540|
|Other One-Time Costs - Year 3||$682,700||$0|
|Ongoing Operating Expenses - Year 3||$0||$1,715,040|
Thirty-six percent cost savings using the cloud over an in-house appliance is clearly dramatic—and probably far greater than many would have expected. A further advantage of the cloud that these numbers do not show is that not only does it require no up-front capital investment, but it also provides the flexibility to quickly ramp up when activity increases and to terminate costs when the project is finished. With an in-house platform, operating expenses continue, regardless of the level of activity, and there is constant worry about the investment becoming an idle money pit.
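The headline percentage follows directly from the two three-year totals. The sketch below uses the rounded totals quoted in the text, so the result is approximate.

```python
# Three-year totals as quoted in the text, rounded to the nearest $100,000.
cloud_total = 4_000_000
in_house_total = 6_300_000

# Savings are measured against the in-house total.
savings = in_house_total - cloud_total
savings_percent = savings / in_house_total * 100

print(f"Savings: ${savings:,}")
print(f"Savings as a share of in-house cost: {savings_percent:.1f} percent")
```

With these rounded inputs the share comes to a little over 36 percent, in line with the figure reported above.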
INTANGIBLE COST CONSIDERATIONS
Not every cost associated with an e-discovery platform is capable of precise calculation. There are intangible considerations that range from the experience, reputation and stability of the vendor to the security of the infrastructure and the defensibility of the process. Our TCO analysis could not factor in these intangibles, but lawyers must consider them whenever selecting a platform.
One significant variable in the overall cost of e-discovery is review, the process of examining documents and tagging them as responsive or privileged. Our TCO analysis omits review because it is difficult to measure with precision. However, there is evidence that the right platform can improve the speed and accuracy of review and therefore lower its cost.
Two other intangible but essential considerations for legal professionals are defensibility and professional responsibility.
In e-discovery, defensibility encompasses any aspect of e-discovery that a lawyer may have to defend in front of a judge. Your choice of platform can bolster or weaken the defensibility of your process. A platform should provide robust tracking and reporting capabilities so that you can oversee and document the steps you take, and it should provide for collaboration, redundancy and quality control so that the possibility of a misstep is minimized.
With regard to professional responsibility, a lawyer has an ethical duty to ensure the confidentiality and security of client information. Ethical opinions make clear that this duty extends to technology. Before using any platform, cloud or local, for e-discovery, the lawyer should take steps to ensure that the platform and the vendor meet the high standards that ethics rules demand.
As a longtime trial lawyer who has been steeped in legal technology throughout my career, I am convinced that the cloud is the better option for large-scale e-discovery. Of course, every legal team must make its own independent determination about which platform is best. While a number of factors are relevant, cost is always near the top of the list. As our analysis shows, when you evaluate the total cost of ownership of cloud versus local platforms, the cloud wins, hands down.
Glossary of Technical Terms
- deNIST: To remove program files found on the NIST list from the electronic file population before submitting that population to a search and review platform. NIST files do not contain user-provided text content, and are rarely reviewed or produced during discovery. (See NIST below.)
- ESI: An abbreviation for “electronically stored information.” This term is used in the Federal Rules of Civil Procedure to describe the files subject to electronic discovery.
- Full Redundancy: This means that there is a second copy of all hardware and software used in a system. If one device fails, the system can switch over to the other with minimal interruption for the user.
- GB: Refers to a gigabyte, which is the equivalent of 1,024 megabytes (MB) or roughly 1 billion bytes of data.
- GigE Link: A high-speed connection between an office and a data center. It is short for Gigabit Ethernet, which can transfer 1 billion bits per second (1,000 Mbps) using the ubiquitous Ethernet networking protocol.
- MB: Refers to a megabyte, which is the equivalent of 1,024 kilobytes (KB) or roughly 1 million bytes of data.
- Mbps: Stands for “megabits per second” and refers to the speed of data transmission. One Mbps equals 1 million bits per second; the 1,000 Mbps link described in this article is a fairly high rate of transmission.
- Native Files: Files typically used in discovery, such as Word, Excel, PowerPoint, email or other files used to create and share content. A native file is one that has not been converted to a display format such as PDF or TIFF.
- NIST: Stands for the National Institute of Standards and Technology, a government agency focusing on measurements, among other things. NIST publishes a list of the Hash values for commercially available software. The NIST list is used during processing to identify and remove program files that are typically not suitable for search and review.
- Production: The act of producing files to an opposing party for disclosure or in response to a production request.
- TB: Refers to a terabyte, which is the equivalent of 1,000 gigabytes (GB) or 1 trillion bytes of data.
- Technology Refresh: A reference to the periodic need to replace hardware (and sometimes software) because it has become out of date.