Justice by Video: Do Courts Need Legal Guidelines for Video Evidence?
Sandra Ristovska and Yael Granot
February 20, 2024 Feature

From cell phones to police body cameras, today’s courts increasingly use video as evidence. A common assumption is that video can help people bear witness to an event as if they themselves were transported to the complicated scene of its unfolding. Rather than the secondhand testimony provided by eyewitnesses, video is assumed to provide an unmediated and firsthand account directly to the decision-maker. Video has thus been praised as a seemingly objective tool for justice and accountability. At the same time, video, just like eyewitness testimony, can be subject to a host of biases of which people are largely unaware.
Despite the potential for bias, U.S. courts at all levels lack clear guidelines on how video can be used and presented as evidence. As a result, courts can interpret video evidence differently, even within the lifespan of the same case. Scott v. Harris, a 2007 U.S. Supreme Court decision, provided an early cautionary tale about the challenges of video evidence. In this oft-cited case, the Court had to decide whether a police car chase, which left a driver paralyzed, violated the constitutional protection against unreasonable seizure. The car chase was recorded on two dashboard cameras. Lower courts ruled that reasonable jurors could differ as to whether the police had used unreasonable force to end the chase, requiring a jury trial. The Supreme Court, though, ruled 8–1 in favor of the officer, explaining that the case was “clear from the videotape” and that no reasonable juror could agree with the plaintiff’s account that the police had used excessive force to stop the vehicle. In an unprecedented move, the Court uploaded the video to its website, inviting the public to confirm that the video “speak[s] for itself.” Social scientists accepted the challenge and conducted an experimental study showing that people indeed had different interpretations of the reasonableness of the officer’s behavior depicted in the video. People’s perceptions differed depending on their cultural and ideological backgrounds, including racial identity, education, income level, and political affiliation. The researchers thus concluded that what a video says “depends on to whom it is speaking.” In other words, seeing is not only about what the eyes physically see but also about the experiences and ideas a viewer brings to an image.
Photographic and videographic images work through two forces, which visual communication researchers have termed denotation and connotation. Denotation refers to the literal meaning, while connotation describes how an image draws from the wider cultural context in lending meaning to what is depicted. When Justice Scalia famously compared the car chase in Scott v. Harris to a scene in the Hollywood movie The French Connection, he drew on the connotative level of meaning, couching the dashcam footage in a symbolic frame consonant with broader understandings of the world as facilitated by popular culture. The interplay between denotation and connotation, between the particular and the symbolic meaning of images, complicates the evidentiary status of video in court.
These two levels of visual meaning help us distinguish what an image shows from what a viewer infers from it. And what viewers infer from images is subject to bias from both endogenous factors (e.g., prior attitudes and group identities) and exogenous factors (e.g., features of the medium or the viewing context). Lab experiments, for example, have demonstrated the endogenous influence of identification with police on interpretations of videos of police–civilian altercations. Similarly, cultural worldviews predicted interpretations and legal judgments regarding videos of protest. Exogenous influences, on the other hand, explain why videos played in slow motion, compared with normal speed, led to judgments of greater intentionality in the depicted action.
The binary distinction between endogenous and exogenous influences is useful because it can facilitate the development of a taxonomy of the sources of systematic bias in appraisals of video evidence. This process can mimic work on eyewitness testimony, in which researchers distinguished “estimator variables,” like lighting conditions and distance, whose impact can only be estimated after the fact, from “system variables,” such as lineup construction, which are more amenable to procedural intervention. Distinguishing where biased interpretations of video evidence come from can provide insight into when and how instructions, regulations, or other solutions could be necessary and effective.
In the examples we present below, we highlight three main issues identified in the literature that we think help capture the ways that interpretations of video evidence may be biased. First, people struggle with how to distinguish between relevant and irrelevant information, thus overweighting what they see. Second, people disregard the information they miss when viewing a video, thus underweighting what they do not see. Third, people are overconfident in their interpretations, thus lacking awareness of their own potential for bias. The seemingly direct, firsthand experience they feel they get from video evidence makes it particularly difficult to question their interpretations and subsequent judgments.
Overweighting: What Is Seen Is Salient
Video evidence makes certain information particularly salient, and viewers ascribe greater agency or causal force to the targets or elements that they perceive as most salient in a video. This phenomenon is perhaps best exemplified by work on camera perspective bias. Across multiple studies involving videotaped police interrogations, social psychologists found that the angle of the camera greatly influenced how people evaluated the confession they saw. If, for example, viewers saw only the suspect, who might have been sweaty or nervous, they were likely to think he appeared guilty and thus to believe in the sincerity of his confession. If, however, they were given more visual information—such as when the camera angle included both the police officer and the suspect—they suddenly had another actor to whom they could ascribe causal agency. Researchers found that this wider angle led to reduced perceptions of guilt and confession sincerity. The angle of the camera directs viewers’ attention, often leading them to ascribe more agency, and sometimes blame, to the target they can see.
This process of overweighting can be so powerful that people may ascribe veracity to what they see, even when viewing something they objectively know did not occur. In one set of studies, participants sat in a room with a video camera and completed a computerized task. When they got a question correct, denoted by a green check on their computers, they were instructed to take money from a bank. When they got questions wrong, denoted by a red X on the screen, they could not take money. Researchers then had participants come in for a second session and accused them of taking money when they should not have. Half of the participants were simply told there was video evidence of them committing this misdeed, whereas the other half actually saw doctored footage of themselves committing the infraction. While most participants ended up signing a confession for the experimenter, only those who watched the doctored footage confabulated details to explain what happened, suggesting they genuinely came to believe they had committed the act. The weight of what they saw eclipsed their own knowledge of the facts.
In these ways, people may overemphasize the relevance of what they see, leading video’s salient features to have an outsized influence over legal judgments. Further, people may do this overweighting even when they have clear reason to question their own eyes.
Underweighting: What Is Missed Is Trivialized
Just as people may overweight the information they see, they also conversely underweight the information they miss. Information may be missed because viewers direct their attention away from it. However, sometimes viewers even miss things directly in their line of sight, a phenomenon known as inattentional blindness. In the most well-known demonstrations of this phenomenon, commonly called the “invisible gorilla” studies, over half of people missed a man in a gorilla suit walking through the middle of a ball-passing game between players in white and black shirts. This study was in part inspired by the real legal case of Boston police officer Kenneth Conley, who was involved in a foot pursuit. In the course of chasing a suspect, Conley ran by fellow officers beating a Black man who was later established to be an undercover officer. The court found Conley guilty of perjury when he claimed he had not witnessed the beating despite passing directly by it. Yet, research on inattentional blindness might explain how it is possible to miss dynamic actions happening within one’s field of vision.
Information may also be missed because the camera angle fails to capture it. Police body-worn cameras (BWCs) usually show very few visual cues of the officer wearing them, whereas dashboard cameras mounted on police cars tend to offer a broader visual perspective. In one study, scientists showed participants the same police–civilian altercation captured by either a BWC or a dashboard camera. They found that body camera footage systematically decreased perceptions of police intentionality and culpability for the same actions relative to dashboard camera footage. Because the officer was not visually salient, viewers were less likely to ascribe agency to him, potentially underweighting his role in the depicted event.
People may not realize they are underweighting because the brain often fills in the blanks, perceptually and cognitively, in such a way that viewers do not feel they are missing information when viewing images. In one set of studies, for example, participants saw incomplete images, such as a half-completed picture of a butterfly. On a subsequent memory task, participants were asked to recall what they saw and, when given a chance to choose between a full butterfly and the partial image they had in fact seen, they were more likely to misreport seeing the complete image. This perceptual “filling in” extends to real-world scenes. Researchers have identified a phenomenon called boundary extension, in which people, informed by past experiences, misremember images as having been more expansive than they were, in effect filling in details that were never shown.
In other words, people might ascribe less weight to potentially relevant information if it is not presented or if they do not attend to it. Further, missed information may carry so little weight precisely because people fail to question its absence.
Overconfidence: What Is Perceived Is Certain
The tendencies to overweight and underweight may be particularly pernicious in that people do not realize their own vulnerability to these biases. Viewers are remarkably confident in their visual experiences. This confidence is so strong that they may trust erroneous visual information even when it is directly contradicted by relevant information from the other senses. In a classic psychology study, participants were asked to estimate the size of an object held in their hand while looking at it through a size-distorting lens. Though their hands supplied accurate tactile information, people’s estimates erred toward the incorrect information their eyes supplied.
People consider their visual experiences to be accurate and reliable even as they are willing to acknowledge others’ susceptibility to bias when viewing video. In one study of video evidence of a police–civilian interaction, participants said they were less susceptible to bias than the average American, rating the statement “If I’m paying very close attention to the event, I can prevent my worldview from affecting my understanding of [the video]” as more true of themselves than of other viewers.
Courts are not yet well equipped to address the problems that overconfidence creates. Some of the elements built into the legal process that might mitigate these biases do not seem to be effective. For example, one might imagine that the opportunity to watch video evidence multiple times throughout legal proceedings would undo any weighting errors that might occur upon first appraisal. In one study, however, researchers showed participants the same videotaped police–civilian altercation twice, using eye-tracking technology to measure their patterns of visual attention. Results revealed a visual confirmation bias, such that people seemed to systematically follow the same visual path they had already trodden. Unless viewers are encouraged to gather new information, they might become even more certain about their initial interpretation.
Regulating Video Evidence
The challenges that overweighting, underweighting, and overconfidence create speak directly to the discrepancies between what a video shows and how a viewer interprets it. Judges, lawyers, and jurors operate within a legal framework that has not yet given images the same procedural scrutiny as it has given words. When video is presented as an opportunity for those in the courtroom to witness with their own eyes the events at the core of the litigation, consideration of the various influences on how people interpret images may get sidelined. Courts nonetheless tend to see video as an objective record, partly because of uncertainty about how to classify video as evidence and partly because of insufficient attention to visual legal literacy. Together, these factors contribute to an inconsistent legal approach to video evidence.
Video’s classification as evidence is significant because it can influence how judges decide on video’s admissibility or how attorneys engage with the footage in court. The introduction of photography in the nineteenth century gave rise to demonstrative evidence as a category governing the use of images as evidence. Under this view, photorealistic images cannot prove facts on their own. They can merely illustrate what witnesses say or what other evidence shows. Demonstrative evidence, however, has always been an uncertain category ranging from illustration to proof. It remains an ambiguous term because courts and legal scholars alike disagree about its meaning. Video can be used as an illustrative aid, but its use as direct evidence, capable of independently proving facts, is also common. Video has also long been admitted under the “silent witness” exception when it records events no one else was present to see. At the heart of these shifting and ambivalent categories under which video can be admitted as evidence is an uncertainty about how to evaluate video’s probative value, or the degree to which it can prove the facts it is offered to prove.
The appraisal of video evidence is further complicated by cognitive and social factors that influence how people see and interpret images. Visual literacy training that addresses these factors is still not broadly represented in law school curricula and professional programs, though there are more trial advocacy programs today that teach how to interrogate video. Visual literacy can help judges and attorneys learn strategies for how to analyze evidentiary video—how to probe and ask relevant questions of the underlying content.
At a time when generative AI and deepfakes threaten to exacerbate existing challenges with video evidence, we argue that safeguards for visual interpretation in court are urgently needed. Without clear guidelines for the use of video as evidence, judges, lawyers, and jurors are left to treat video in highly varied ways that can lead to uneven renderings of justice. Research on guidelines that can optimize the overall consideration of video as evidence is ever more important for promoting consistency and fairness for those who use the courts.