February 20, 2024 Feature

Constructing Meaning in Video Evidence

Christina Spiesel and Neal Feigenson

People intuitively believe that video puts them directly in touch with what really happened. They tend to think that the camera objectively records what is unfolding in front of the lens and that the resulting video looks like what they would have seen had they been there. Of course, video never just shows us what happened. And it never just “speaks for itself,” as Supreme Court Justice Antonin Scalia famously incanted in Scott v. Harris, where the Court accorded dashcam video the status of conclusive proof. Video is often incomplete, often ambiguous with respect to the issues in dispute, sometimes misleading, sometimes in conflict with other video, and almost always open to reasonable differences in interpretation. Given the proliferation of surveillance cameras and now with cameras on every cell phone, there may also be multiple videos of litigated events, many of them posted to the internet, where even more people may try to decode what has happened.

Understanding what video means, what it is and is not capable of providing evidence of, always demands looking inward toward the medium itself. Sometimes critical information can be elicited only by scrutinizing the individual frames; sometimes the crucial fact occurs between the frames, so it’s not actually in the video at all but implied by what comes before and after. Understanding video also requires looking outward toward the contexts in which every video is embedded. Advocates and witnesses provide verbal context; other images, including other video, can provide visual context. Videos also carry cultural associations of all sorts, which inflect the meanings that legal decision-makers take away from their viewing (and listening) and can lead lawyers to make assumptions, right or wrong, about how others will interpret what they are seeing. We illustrate these general points with six observations based on specific cases.

Presenting Still Photographic Evidence

We begin with still photographs because video is made up of still frames strung together. In Kennedy v. Bremerton School District, a high school football coach made it a practice to pray after games at the 50-yard line with members of both teams. He claimed that the school district had fired him for violating its policy against public religious practice by public employees while on the job. When the case came before the Supreme Court, Justice Gorsuch, writing for the majority, described the coach’s actions as offering “a quiet prayer of thanks.” Justice Sotomayor, dissenting, wrote: “This case is about whether a school district is required to allow one of its employees to incorporate a public, communicative display of the employee’s personal religious beliefs into a school event.” She included in her opinion three photographs from the record, visual evidence that Justice Gorsuch did not address. Justice Sotomayor sets the visual text in dialogue with the written text, almost mano a mano, inviting us to see for ourselves.

The first photo from her opinion shows the coach surrounded by his team (and possibly players from the opposing team) while raising two helmets up high. This certainly seems to refute the majority’s description of the prayer as “private” (it is hard to see the people in the stands who are watching) and perhaps also “quiet” (although the photo doesn’t have sound). What else might it signify? Are the raised helmets trophies? Objects of veneration? Does raising them elevate the boys who presumably wore them as having engaged in a kind of sacred activity?

Many viewers will be aware, if only vaguely, of the deep-seated ties between religion, specifically Christianity, and American football, as evidenced by phrases like the “Hail Mary” pass and the “Touchdown Jesus” mural on the University of Notre Dame campus. Many of them would also bring to the picture some awareness that the 50-yard line is a special place on a football field, not just because it’s center stage but also because it often features a logo associated with the home team and hence the home institution. For other viewers, the photo would prompt none of these associations; they might just enjoy football or know little to nothing about it. The point is that what the photo means to viewers can be as varied as their divergent views about whether this sort of prayer by a public employee on school grounds ought to be allowed. Advocates face the challenge of deciding which cultural community they most need to address. If the case teaches us anything, it is that how to make that decision is far from obvious.

One last observation. The original photo that was part of the trial court record is in color. Does the black and white photo in Justice Sotomayor’s opinion reduce its emotional intensity and bring the image closer to writing, making the markings on the field and the numbers on the uniforms of the players more salient—and if so, might that distract our attention from the issues before the Court? And how might the black and white image affect the public audience for the opinion?

Viewing Video Frame-by-Frame

Sometimes information in a video can be discovered only if we watch it frame by frame. A common standard for full-resolution video is 24 frames per second, but that is just what the monitor displays and the timecodes indicate; it’s not what our eyes see. People can respond to stimuli in fractions of a second, but not that fast, and they respond more slowly to visual stimuli (about a quarter of a second) than to auditory or tactile ones. The important message here is that human eyes can miss the details in video whizzing by at 1/24th of a second per frame, not to mention that important things that happen between frames are not in the video at all.

Lawyers who need to explore a piece of video for its full evidentiary value have to upload it into video editing software, which may enable them to get more information out of it than they can by watching the more lifelike, fast-moving stream. One of us (Spiesel) was asked to explore a piece of body-worn camera video in a case involving police officers’ use of excessive force. The one video submitted for analysis was 6½ minutes long—over 9,000 frames. The officer whose video was submitted was the second to arrive at the scene. Did the other officer not have a bodycam? If he did and he turned it on, what happened to his data? It is reasonable to assume that the plaintiff’s lawyer was looking for evidence of excessive force, but the review of the video was also looking for evidence that the bodycam data may have been altered.
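The arithmetic behind these figures is easy to check. The following is a minimal sketch, assuming the 24 frames-per-second rate used in this article; a particular body-worn camera may record at a different rate, and the function names are hypothetical, chosen only for illustration.

```python
FPS = 24                    # frame rate assumed in this article
CLIP_SECONDS = 6.5 * 60     # the 6.5-minute bodycam clip
REACTION_SECONDS = 0.25     # approximate visual reaction time (a quarter second)


def frame_count(duration_s, fps):
    """Number of still frames in a clip of the given length."""
    return round(duration_s * fps)


def frame_duration_ms(fps):
    """How long each frame stays on screen, in milliseconds."""
    return 1000.0 / fps


def frames_elapsed(interval_s, fps):
    """Whole frames that go by during a time interval."""
    return int(interval_s * fps)


def timecode(frame_index, fps):
    """Convert a zero-based frame index to a MM:SS:FF label."""
    total_seconds, frame_in_second = divmod(frame_index, fps)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}:{frame_in_second:02d}"


if __name__ == "__main__":
    print(frame_count(CLIP_SECONDS, FPS))          # frames in the clip
    print(frame_duration_ms(FPS))                  # milliseconds per frame
    print(frames_elapsed(REACTION_SECONDS, FPS))   # frames per reaction time
    print(timecode(frame_count(CLIP_SECONDS, FPS) - 1, FPS))  # last frame
```

Under these assumptions the clip holds 9,360 frames, each on screen for under 42 milliseconds, so about six frames pass in the quarter second a viewer needs to react to a visual stimulus.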

The video begins with the sound of everyday communications between the second officer and the dispatcher while the officer is driving to the scene. Upon arrival, the officer moves to the suspect’s parked vehicle, removes him from the car, and puts him on the ground. The direction of the camera constantly and rapidly changes—the bodycam was attached to the officer’s torso and moved as he did—confusing what can be seen. The officer who was already present joins in. There is a lot of tussling as the suspect ostensibly tries to get up. Some frames show one and then both officers putting pressure on the suspect’s neck with their hands. At least part of that time, we can infer from the camera angle that the officer wearing the camera is straddling the suspect’s hips. Exactly one frame in the whole vast stream of pictures shows the right hand of an officer making a fist, which seems to be hanging from above and aimed directly at the camera. But there is no frame showing the fist striking the suspect. At the end, the suspect is shown walking, with assistance, away from the scene.

The video leaves us with many questions. Why do we not see the suspect getting up and starting to walk? How can we interpret very fast changes in the positions of the officers, both of whom appear to be well-muscled guys? Just how much time does it take for them to change positions? Are any frames missing?
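One way to start on the question of missing frames is to compare the spacing of frame timestamps with the spacing the frame rate predicts. The sketch below assumes that per-frame timestamps (in seconds) have already been exported from the file by some tool; the function name, the tolerance value, and the sample data are all hypothetical. Cameras that record at a variable frame rate produce uneven spacing on their own, so a flagged gap is a prompt for further inquiry, not proof of alteration.

```python
def find_timestamp_gaps(timestamps, fps, tolerance=0.5):
    """Flag places where consecutive frames are farther apart than expected.

    timestamps: per-frame times in seconds, in playback order (assumed input).
    fps: nominal frame rate of the recording.
    tolerance: fraction of one frame interval allowed before flagging.
    Returns tuples of (index_before, index_after, gap_seconds, est_missing).
    """
    expected = 1.0 / fps
    gaps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if delta > expected * (1 + tolerance):
            # Estimate how many frames would fit in the gap at the nominal rate.
            est_missing = round(delta / expected) - 1
            gaps.append((i - 1, i, delta, est_missing))
    return gaps


if __name__ == "__main__":
    # Illustrative data only: ten frames at 24 fps with frames 5 and 6 removed.
    stamps = [n / 24 for n in range(10) if n not in (5, 6)]
    for before, after, gap, missing in find_timestamp_gaps(stamps, fps=24):
        print(f"gap of {gap:.3f} s between frames {before} and {after}; "
              f"about {missing} frame(s) unaccounted for")
```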

Why do we not hear the officers’ voices when they seem to be communicating with one another, even though we cannot see their mouths moving in speech? Police cruiser sirens are screaming in the background throughout. Were they really on the whole time? It seems improbable that two men working in close quarters and close to the recording device, coordinating with each other, cannot be heard even with the ruckus in the background. Examining the soundtrack in a video editing program reveals a dense siren signal that masks any sound waves from conversation. This prompts one last observation useful to attorneys (and others): In a video editing program that shows the frames, audio is a picture (a waveform) and not only a sound. Those waveforms can be very revealing (as well as a tool for precisely editing sound, should that be desired). It takes patience to view all those still frames, but alert observation can discover things otherwise unseen.
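The idea that audio is also a picture can be made concrete by reducing a soundtrack to one loudness value per video frame, roughly what an editing program draws beneath the frames. This is only a sketch under stated assumptions: the audio has already been decoded to a list of samples between -1 and 1, a synthetic tone stands in for a siren, and the function names are hypothetical. It is not a description of how any particular editing program computes its display.

```python
import math


def rms_per_frame(samples, sample_rate, fps):
    """Return one RMS loudness value for each video frame's slice of audio."""
    samples_per_frame = sample_rate / fps
    n_frames = int(len(samples) / samples_per_frame)
    envelope = []
    for i in range(n_frames):
        start = int(i * samples_per_frame)
        end = int((i + 1) * samples_per_frame)
        window = samples[start:end]
        envelope.append(math.sqrt(sum(s * s for s in window) / len(window)))
    return envelope


def synthetic_tone(amplitude, freq_hz, seconds, sample_rate):
    """Generate a steady sine tone (a stand-in for a siren, not real data)."""
    n = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


if __name__ == "__main__":
    tone = synthetic_tone(0.5, 1200, 1.0, 48000)   # one second of "siren"
    for frame, level in enumerate(rms_per_frame(tone, 48000, 24)):
        print(frame, round(level, 3))
```

A siren that holds a steady, high level in every frame leaves little room for quieter speech to register in such an envelope, which is one reason to look at the waveform as well as listen to it.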

Verbally Framing Video Viewing for Jurors

When video material like this does go to court, it is seen in the verbal contexts the lawyers give it. We turn now to experimental evidence indicating that the words advocates use to frame jurors’ viewing of a video can affect what jurors think they see.

In State of Ohio v. Tensing, a police officer pulled a driver over for not displaying his front license plate in the proper place. After the driver did not produce his driver’s license, the officer asked him to step out of the car. When the driver refused and instead began to start the engine as if to leave the scene, a brief struggle ensued, and the officer fatally shot the driver point-blank in the head. The crucial issue in the officer’s murder trial was whether he was justified in doing so because, as the driver started up the car, the officer reasonably believed his life was in danger. And that largely depended on a factual question: Was the officer in harm’s way, about to be run over, when he fired his weapon, or did he shoot the driver first? It’s hard to determine this from the officer’s body-worn camera video. Everything happens fast: The officer’s own movements create motion blur in the images, the fisheye lens distorts the view, and it is difficult to sync the sounds on the video to what can be seen.

One of us (Feigenson) has been conducting a study in which mock jurors, before watching the actual video from this case, read opening statements from the prosecution and defense attorneys, in which the lawyers tell them (or not) what to look for in the video. The study has found that verbally framing mock jurors’ viewing by instructing them on what they will see affects what they report having seen. For instance, when the prosecutor said, “Officer Tensing was already pointing his gun directly at the driver’s head before the driver reached his right hand to put the key in the ignition,” mock jurors were significantly more likely to say that that’s what they saw than when the prosecutor did not tell them to look for this. What is particularly noteworthy about this example is that, based on the video itself, the prosecutor’s statement is unambiguously false: The driver’s hand is clearly holding the key in the ignition well before the officer pulls out his gun. Thus, verbally framing the video can lead viewers to think they saw what didn’t happen, as well as incline them to resolve in the advocate’s favor what the video leaves more ambiguous. This is one very important way in which advocates can try to make video evidence speak as they want it to.

Using Multiple Videos to Tell a Story

On October 1, 2017, Stephen Paddock used a hammer to break windows facing the street in his two suites on the thirty-second floor of the Mandalay Bay Hotel, opposite the Route 91 Harvest country music festival, then in full swing. At 10:05 p.m. he began shooting through those broken windows, carrying out the deadliest mass shooting by a single gunman in U.S. history, leaving sixty people dead and over 800 injured. Paddock died of a self-inflicted gunshot wound before authorities could reach him. Had he lived, there might well have been a trial and a public accounting, perhaps even a satisfactory explanation for why he did it.

Victims and survivors sued the hotel, claiming that it bore some responsibility for allowing Paddock to accumulate, over the course of a week, the arsenal he used in the massacre. Video of Paddock’s movements in and out of the hotel’s public spaces was captured on several surveillance cameras. The case thus offers an opportunity to explore how multiple video streams may be deployed to tell a compelling story. Our discussion is based on the New York Times video reporting, which includes security camera footage of Paddock from his checking in on September 25 to the terrible night of October 1, as well as diagrams and other documentary images. The video segments show Paddock, an unostentatiously dressed, 60-something white man, calmly entering the hotel over and over again, carrying various suitcases and satchels, assisted by porters and others, with whom he chatted and whom he tipped. He repeatedly enters elevators with a lot of luggage. No one seems to be taking any special notice.

The lawsuit settled, but we can ask: If something like the Times video compilation had become part of a legal argument, which side would it serve? For instance, does the video help the plaintiffs make the case that hotel employees should have foreseen that this well-behaved guest, who (based on past visits) seemed his usual self, was planning mass murder? Or would it help the defendant more? Paddock certainly did not fit the now-common stereotype of the out-of-control, angry adolescent or young adult shooter. He looked like a middle-aged man on a gambling trip, the sort of person the hotel was used to accommodating. Could the hotel have put the developing story together at the time from reviewing its own security footage? It is doubtful that Paddock would have stood out in the many lower-quality video feeds that guards would have been monitoring. If anything, the video provides a kind of character evidence: Paddock appeared to be meticulous and aware of his environment and related equally to men and women as he encountered them. All of this might be useful to the defendant, but it emerges only from seeing the thirty or so video bits knitted into this documentary.

Increasingly, advocates have access to much more video than was imaginable even ten years ago. How are you going to be sure you have it all? And when you do, how are you going to make sense of all of it to tell the story of your case? Be prepared to be a bit of a film director.

Turning a Video into a Multimedia Presentation

Lawyers will often want to go beyond integrating multiple videos and using their own words to frame the audience’s viewing. Everyone saw Darnella Frazier’s video of the killing of George Floyd. One might have been tempted to think that just seeing the video would have been enough for jurors to convict Officer Derek Chauvin of murder. But prosecuting a police officer, especially for murder, is never a slam dunk. How can one prove beyond a reasonable doubt that Chauvin caused Floyd’s death “by perpetrating an act . . . evincing a depraved mind, without regard for human life” (part of the third-degree murder charge)?

The prosecution team did not let the Frazier video speak for itself. They wove it into their theory of the case, both visually and verbally, each persuasive tactic reinforcing the others. Here are just a few examples. In their opening statement, before playing the video, prosecutors displayed a single frame taken from it, showing Chauvin kneeling on Floyd’s neck and glaring at the camera. Of course, the Frazier video (and others) would be central to the case, but prosecutors knew the value of giving jurors a single emblematic picture to hold in their minds, and they showed it repeatedly throughout the trial.

The prosecution also incorporated the same still frame from the video, as well as other images and text, into a timeline of the crucial events that probatively summarized their narrative. Among other things, by placing the representations of Chauvin’s conduct across the top of the slide, above but separated by empty background from its effects on Floyd below, prosecutors visually emphasized the psychological distance between the two, and hence Chauvin’s disregard for everything that was happening that should have made him realize, as the minutes went on, that he was risking serious harm to Floyd.

In closing argument, prosecutors emphasized the moral dimensions of the video evidence. They valorized the bystanders’ actions of videorecording the crime in the religious language of “bearing witness.” And they described the bystanders’ videos as “precious” gifts, which the bystanders “gave to you,” the jurors, which—according to anthropologists who have studied cultures of gift exchange—imposed on the jurors the obligation to reciprocate by rendering the desired guilty verdict. Thus, with video as the focal point, prosecutors constructed a multimedia presentation combining different videos, still images, and diagrams, as well as words, to hammer home their theory of the case.

Citing Images in Ads, Video Games, and Social Media

Vast as the world of video evidence is, lawyers need to be aware of an even wider visual environment, including video games and social media, both as potential sources of compelling visual evidence and also for familiarizing themselves with the cultural surround of their cases. After the 2012 massacre at the Sandy Hook Elementary School in Newtown, Connecticut, the parents of several of the murdered children sued the manufacturer of the assault rifle the killer used. The plaintiffs’ lawyer would need to prove that the defendant’s wrongful marketing of the weapon causally contributed to the massacre. He was, of course, prepared to show the defendant’s ads. But he also used imagery from video games like the one the killer spent countless hours playing to show that the defendant used product placement to put realistic renditions of its assault rifles in those boys’ virtual hands—and, in juxtaposition with crime scene photos, to connect that advertising with the crime. This computer- and culture-savvy plaintiffs’ lawyer knew where to look for images that would help his case and knew what to do with those images when he found them. As advocates try to persuade decision-makers and the public at large, being aware of the sorts of images that are out there in mainstream culture is more important than ever before.

The presumptive credibility of video is now under attack. Computers can generate photorealistic images of things that never appeared in front of any lens. In the words of W. J. Mitchell, “the referent has become unstuck,” and with it, the axiomatic reliability of video that derives from its indexical relationship to the reality it depicts.

While the availability of deepfakes may make the authentication of video evidence more laborious and sometimes more contested, that evidence will continue to preoccupy legal decision-makers. And for that reason, it remains imperative for advocates to appreciate what video evidence can mean and can be made to mean. Being visually literate depends on closely observing what is actually there before the eyes, and then thinking carefully about how what is there, in all of its many dimensions, can help (or hinder) the advocate’s persuasive aims.

    The material in all ABA publications is copyrighted and may be reprinted by permission only.

    Christina Spiesel

    Yale Law School

    Christina Spiesel is senior research scholar in law and affiliated faculty fellow at Yale Law School’s Information Society Project, co-author of Law on Display: The Digital Transformation of Legal Persuasion and Judgment, author of numerous chapters and articles, professor of visual persuasion, and a legal visual consultant.

    Neal Feigenson

    Quinnipiac University Law School

    Neal Feigenson is a professor of law at Quinnipiac University Law School. He writes about the uses of visual media in legal practice and has authored or co-authored three books, including Law on Display: The Digital Transformation of Legal Persuasion and Judgment and Experiencing Other Minds in the Courtroom.