As the volume of available data continues to grow at a rapid pace, the ability to effectively and succinctly tell a story about that data becomes increasingly important. This is especially true in a litigation environment, where effective communication of results may be the difference between a favorable and an unfavorable finding by a judge or jury.
The Purpose of Visualizations
Descriptions of analysis and summary tables can convey a lot of information, but they often do not tell the story in a simple, intuitive, and meaningful way. It can be cumbersome to quickly identify patterns and trends (or lack thereof) across a large summary table. In contrast, visualizations have high data density, allowing more information to be shown within less space than either a data table or written description of the data. Visualizations also allow the reader or viewer to quickly ascertain relationships between different data points or to visually emphasize a pattern or trend of interest.
Consider three ways to disseminate information relating to a hypothetical analysis involving the Telephone Consumer Protection Act (TCPA) and alleged improper calls to telephone numbers.
Summary paragraphs encapsulate information in words. “My analysis covered 1,000,000 telephone dialer records and categorized the calls into seven unique categories: ‘Connected,’ ‘Voicemail,’ ‘No Answer,’ ‘Hung Up,’ ‘No Tone,’ ‘Operator Intercept,’ and ‘Do Not Call.’ I found that there were 40,000 ‘Connected’ calls, 30,000 ‘Hung Up’ calls, 20,000 ‘Voicemail’ calls, and 10,000 ‘Do Not Call’ calls. I treated these types of calls as connected calls. Therefore, I found that 900,000 calls, or 90% of all calls, did not connect.”
Although we can determine what is happening in this analysis, it is difficult to easily interpret the information or make comparisons between categories in this form.
Summary tables categorize information. Here the same information is presented in a summary table. A short table makes the categories much easier to compare, but it still requires some explanation: it remains challenging to spot any trends in the summary, and it is cumbersome to gauge the magnitude of the differences among the categories.
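As an illustration only, the figures stated in the hypothetical paragraph above can be laid out as a plain-text summary table with a few lines of Python. The four counts and the 1,000,000-record total come from the example; the function names, row labels, and column widths are assumptions made for this sketch. The three remaining categories appear only as a combined remainder because the example does not break them out.

```python
# Counts stated in the hypothetical TCPA example.
TOTAL_RECORDS = 1_000_000
CONNECTED_TYPES = {
    "Connected": 40_000,
    "Hung Up": 30_000,
    "Voicemail": 20_000,
    "Do Not Call": 10_000,
}


def summarize(counts, total):
    """Return (label, calls, percent of all records) rows, with subtotals."""
    connected = sum(counts.values())
    rows = [(name, n, 100 * n / total) for name, n in counts.items()]
    rows.append(("Treated as connected (subtotal)", connected,
                 100 * connected / total))
    # The example gives no breakdown of the other three categories,
    # so they are reported here only as a combined remainder.
    rows.append(("Did not connect (remaining categories)", total - connected,
                 100 * (total - connected) / total))
    rows.append(("All dialer records", total, 100.0))
    return rows


def format_table(rows):
    """Render the rows as an aligned plain-text table."""
    lines = [f"{'Call type':<40}{'Calls':>12}{'Share':>9}"]
    for name, n, pct in rows:
        lines.append(f"{name:<40}{n:>12,}{pct:>8.1f}%")
    return "\n".join(lines)


print(format_table(summarize(CONNECTED_TYPES, TOTAL_RECORDS)))
```

The four stated counts sum to 100,000, so the remainder row works out to 900,000 calls, or 90% of the 1,000,000 records. Even laid out this way, the reader still has to compare numbers line by line, which is the limitation of a table noted above.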
Data visualizations offer effective, efficient presentations of data. The graph immediately shows the difference between the categories. Furthermore, the graph provides all of the same information as the paragraph and the table but allows for a very simple and straightforward comparison between the categories. It also highlights the magnitude of the difference among the categories in an easy-to-see manner.
Elements of a Good Data Visualization
There are four key characteristics of a good data visualization:
- Observable: The graphic contains a fact or a trend that a layperson can see. Visuals should speak for themselves and tell a story that anyone can follow. If you need to spend extra time explaining the graph, it should be reconsidered.
- Objective: The graphic does not attempt to hide a fact or a trend, nor does it attempt to create one. Misrepresentations may be made through misleading or erroneous titles or through the scale of the graph.
- Original: The graphic contains cited, verifiable data sources and should stand on its own.
- Open: The graphic is clear and concise. Complicated graphs will be confusing, difficult to explain, and difficult to interpret. Simple is best!
Not All Graphs Are Created Equal
Bar charts aptly summarize information from a table. Bar charts have a variety of uses and are typically used when describing and summarizing data from a table. They are primarily based upon univariate (single-variable) data and typically sum up a categorical variable or show a percentage distribution. They are useful for showing comparisons among different categories and illustrating single-variable trends over time, and they are best for summing up and showing simple comparisons.
Bar charts are most effective if a couple of common mistakes are avoided. Consider the four graphs below, which depict the same underlying data. The way in which each graph is constructed can add to (or take away from) its interpretation.
Figure 1 is cluttered with gridlines, and the reader must read text in two directions. Furthermore, the dark gridlines can distract the eye. As Figure 2 shows, you should lighten (or remove) the gridlines and rotate the text so that the data is clear to the viewer. In Figure 3, we rotate the entire graph. This shows the scale of the impact across the categories and allows for top-to-bottom viewing. Figure 4 adds impactful color that highlights the trend for the viewer. It also adds data to the graph in the form of percentages and adds a meaningful title. (Alternatively, we could leave the data unsorted to show variation in the categories.)
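The adjustments described for Figures 2 through 4 can be sketched in code. The example below is a minimal illustration, not the method used to produce the article's figures: it assumes the matplotlib library is installed, uses only the four counts given in the earlier hypothetical, and the function names, colors, title, and output file name are all made up for the sketch.

```python
# Counts from the hypothetical TCPA example, in no particular order.
TOTAL_RECORDS = 1_000_000
COUNTS = {
    "Connected": 40_000,
    "Voicemail": 20_000,
    "Hung Up": 30_000,
    "Do Not Call": 10_000,
}


def prepare_rows(counts, total):
    """Sort categories largest first and attach each one's share of all records."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, n, 100 * n / total) for name, n in ordered]


def draw_chart(rows, path="call_outcomes.png"):
    """Write a horizontal bar chart to `path` (hypothetical file name)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    names = [name for name, _, _ in rows]
    values = [n for _, n, _ in rows]
    # One strong color for the largest category, a muted one for the rest.
    colors = ["#1f4e79"] + ["#9db7d5"] * (len(rows) - 1)

    fig, ax = plt.subplots(figsize=(7, 3))
    bars = ax.barh(names, values, color=colors)  # horizontal: labels read one way
    ax.invert_yaxis()                            # largest category at the top
    ax.xaxis.grid(True, color="0.9")             # light gridlines only
    ax.set_axisbelow(True)                       # keep gridlines behind the bars
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    # Label each bar directly with its count and percentage.
    for bar, (_, n, pct) in zip(bars, rows):
        ax.text(bar.get_width(), bar.get_y() + bar.get_height() / 2,
                f" {n:,} ({pct:.1f}%)", va="center")
    ax.set_xlim(0, max(values) * 1.25)           # leave room for the labels
    ax.set_title("Calls treated as connected, by dialer outcome")
    ax.set_xlabel("Number of calls")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)


rows = prepare_rows(COUNTS, TOTAL_RECORDS)
```

Calling `draw_chart(rows)` writes the image. To leave the categories unsorted, pass the rows in their original order instead of the sorted order produced by `prepare_rows`.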
Pie charts are inferior to bar charts. Besides bar charts, the most common graphs are line graphs, scatter plots, and pie charts. Line graphs are great for showing trends across multivariate data that involves time and multiple categorical variables, for example, the number of calls by call type category over time. Scatter plots are excellent for showing relationships between two continuous variables, for example, the company’s revenue and the number of connected calls by year. Scatter plots can be used to highlight unseen patterns or relationships that exist between variables.
Pie charts are common but not very useful. They are difficult to interpret and difficult to compare. Your eyes are very good at comparing length (such as that depicted in a bar chart) but very bad at comparing area or angle (such as that depicted in a pie chart). As the example below shows, trying to display all of the categories and all of the information that the bar charts easily convey turns into a jumbled mess. Your eye has a hard time judging the difference between slices or across different sets of pie charts. Even with sensible changes to the pie chart (such as removing 3-D, modifying the colors, direct labeling, and sorting the data), the bar chart is vastly superior in displaying the same information.
Data misrepresentations affect the accuracy of visualizations. It is very easy—too easy!—to create graphs that overstate, understate, or misrepresent fact patterns and trends in the data. Some of the most common methods for doing this involve changing the scale on a graph, graphing only part of the data, and transforming the graph or data using inappropriate methods.
Consider the following when reviewing or designing a graph:
- Is the scale of the y-axis appropriate for the data being displayed?
- Is all of the data being shown, or is the data being truncated in a way that hides a pattern or trend?
- Has the data been transformed or manipulated in some way that hides a pattern or trend?
- Has the graph been properly sourced and cited, and can it stand alone away from any presentation or writing that is next to it?
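The first question in the checklist, about the y-axis scale, can be made concrete with a small calculation. The sketch below compares how far apart two bars look when the axis starts above zero with how far apart the underlying values actually are. The two counts are borrowed from the earlier hypothetical; the function names and the 25,000 starting point are assumptions chosen purely for illustration.

```python
def drawn_ratio(larger, smaller, axis_start=0.0):
    """Ratio of the two bars' drawn heights when the axis begins at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


def exaggeration(larger, smaller, axis_start):
    """How many times the drawn ratio overstates the true ratio of the values."""
    return drawn_ratio(larger, smaller, axis_start) / (larger / smaller)


connected, hung_up = 40_000, 30_000  # counts from the hypothetical example
for start in (0, 25_000):
    print(f"axis starts at {start:>6,}: "
          f"bars differ by {drawn_ratio(connected, hung_up, start):.2f}x, "
          f"data differ by {connected / hung_up:.2f}x")
```

With the axis starting at zero, the bars differ by the same factor as the data. With the axis starting at 25,000, the taller bar is drawn three times as tall as the shorter one even though the count is only about one-third larger, which is the kind of distortion a reviewer should look for.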
Accuracy affects admissibility of visualizations. Beyond aiding correct interpretation by the reader or viewer, the accuracy of visualizations in a litigation context could be the difference between their admission into evidence and their exclusion. See In re Air Crash Disaster at John F. Kennedy Int’l Airport on June 24, 1975, 635 F.2d 67, 73 (2d Cir. 1980). Rule 1006 of the Federal Rules of Evidence provides that summary evidence and visualizations may be properly admitted when the following conditions are met:
- the charts “fairly summarize” voluminous trial evidence;
- they assist the jury in “understanding the testimony [or evidence] already introduced”; and
- “the witness who prepared the [visualization] is subject to cross-examination with all documents used to prepare the summary.”
United States v. Green, 428 F.3d 1131, 1134 (8th Cir. 2005); see also United States v. Boesen, 541 F.3d 838 (8th Cir. 2008); United States v. King, 616 F.2d 1034, 1041 (8th Cir. 1980).
There has been some debate in other cases about whether all underlying documents must be in evidence or whether the visualization alone may be admitted as evidence. See, e.g., United States v. Janati, 374 F.3d 263 (4th Cir. 2004); United States v. Jones, 664 F.3d 966 (5th Cir. 2011). While we will not discuss these cases or the debate in this article, we will note that in all cases the court has ruled that the visualizations must be an accurate summarization of the underlying documents and that the underlying documents must be admissible.
Data visualizations can powerfully impact how someone views and understands data. However, visualizations need to be carefully crafted to ensure accuracy in their representation of underlying facts and to have maximum impact and interpretability for the reader or viewer.
Navigant Consulting is the Litigation Advisory Services Sponsor of the ABA Section of Litigation. This article should not be construed as an endorsement by the ABA or ABA Entities.
Copyright © 2018, American Bar Association. All rights reserved. This information or any portion thereof may not be copied or disseminated in any form or by any means or downloaded or stored in an electronic database or retrieval system without the express written consent of the American Bar Association. The views expressed in this article are those of the author(s) and do not necessarily reflect the positions or policies of the American Bar Association, the Section of Litigation, this committee, or the employer(s) of the author(s).