July 01, 2015

“. . . and the courts have been utterly ineffective”

By Michael J. Saks and Ashley M. Votruba

Editor’s Note: A full version of this article that includes endnotes and charts is available from the authors, who can be reached at [email protected] and [email protected].

From a practical perspective, the challenge judges face in being gatekeepers of expert evidence is deciding on which side of a critical line a proffer stands. Is the proffered testimony “shaky but admissible” (and open to attack at trial for weight and credibility), or does it stand on ice so thin that it is likely to be more misleading to fact finders than helpful?

A number of forensic “sciences” have been especially difficult for judges to evaluate. At the root of the difficulty is something of a perfect storm: faith in these areas is unusually strong, while their empirical foundations (that is, the evidence that answers the question of how sound they are or aren’t) are unusually weak. Hollywood and popular fiction have persuaded generations that, when a “forensic scientist” asserts a conclusion, it will be unfailingly correct. And yet, as a committee of the National Academy of Sciences (NAS), formed pursuant to a congressional mandate to study the state of the forensic sciences, concluded: “The bottom line is simple: In a number of forensic science disciplines, forensic science professionals have yet to establish either the validity of their approach or the accuracy of their conclusions, and the courts have been utterly ineffective in addressing this problem.” COMM. ON IDENTIFYING THE NEEDS OF THE FORENSIC SCI. CMTY., NAT'L RESEARCH COUNCIL, NAT'L ACAD. OF SCIS., STRENGTHENING FORENSIC SCIENCE IN THE UNITED STATES: A PATH FORWARD 53 (2009) (“NAS report”), available at http://www.ncjrs.gov/pdffiles1/nij/grants/228091.pdf.

Cause for Worry

Four different sets of findings and occurrences give cause for concern about how well the courts are doing in their gatekeeping of these areas—and about the consequences of ineffective gatekeeping.

Deceased Forensic Sciences

In recent years, a number of forensic science techniques have been found to be so lacking in validity that they have been laid to rest, though never by the courts. Courts routinely admitted each of them despite the lack of an adequate (or, indeed, any) foundational showing of validity, and they did not recognize the severe shortcomings of these fields. Judges in early cases were content with bald assurances that what the witnesses had to offer was flawlessly valid. The more decades that passed without serious reexamination of those claims, the more certain successive generations of judges became that the claims stood on solid ground.

Voiceprint identification was the most controversial of these, admitted by some courts and excluded by others. Unfortunately, the controversy only underscores the gatekeeping problem. Every court that employed a narrow version of the Frye test of general acceptance (defining the “relevant field” as being limited to practitioners of voice spectrography) admitted the testimony. Every court that employed a broad version of the Frye test (defining the “relevant field” to include acoustical engineers, linguists, and statisticians as well as practitioners of voice spectrography) refused to admit the testimony. The Frye test can, after all, operate as a “take their word for it” test: whichever field a court chooses to listen to will tell the court what to believe.

An NAS committee was assembled at the FBI’s request to review the scientific evidence. The committee found that the empirical evidence failed to support the claims of voiceprint examiners that they could dependably identify the source of a recorded voice. Consequently, the FBI stopped offering voiceprint witnesses to prosecutors, and the field rapidly declined toward extinction.

Another terminated field was comparative bullet lead analysis (CBLA). The CBLA technique, performed only by the FBI, claimed to be able to match a bullet to the “melt” in which it was created. (Each melt was thought to be unique, so bullets that shared a very similar metallurgical profile were thought to have all been created on one occasion in the same foundry.) The FBI used such evidence to link a crime scene bullet to a box of bullets in a suspect’s possession. An NAS committee reviewed the evidence for these claims and found that they lacked validity. The FBI ceased performing CBLA.

A third field that offered courts junk forensic science without the courts noticing was arson investigation. For decades, its practitioners claimed to know how to distinguish a set fire from an accidental one. They relied on a long list of arson “indicators,” among them burn patterns on floors, the angle of scorch marks on walls, spalled concrete, crazed glass, the color of the fire’s smoke, and a dozen others. The problem was that these indicators were based on nothing more than hunches about why some fires produced certain patterns that were absent from other fires. The field was slow to take the next step: empirically testing whether set fires produce indicators that distinguish them from accidental fires.

Although those embryonic ideas had not been tested to determine their validity, they nevertheless were testified to as if they were established findings. Moreover, by allowing the testimony, courts inadvertently delayed serious testing of the field’s speculations. Eventually, however, the fire and arson field did begin to test its ideas empirically. In some condemned structures investigators set fires simulating accidental origins; in others, they set fires simulating arson. Nearly two dozen of the supposed indicators turned out to indicate nothing: they did not distinguish arson from accident after all. As word gradually spread among arson examiners, testimony based on those purported indicators began to disappear from the courts. While the field of arson investigation has not been terminated, a large part of its corpus of speculative knowledge has been.

What remains troubling, however, is that courts were, and remain, unable to meaningfully evaluate claims of scientific facts and techniques, even when those claims have undergone no serious validity testing and evidence law makes admission contingent on precisely such validation. Which of the techniques still being admitted will in the future be recognized as needing significant limitation, if not termination?

Inability to Distinguish the Innocent from the Guilty

One might think that the case of a person who is actually innocent of a serious crime, but who is nonetheless prosecuted, would look different from the case of someone who did in fact commit the crime. Surely the evidence marshaled against innocent defendants cannot be as inculpatory as that assembled against guilty defendants. Surely judges can perceive some difference in the two types of cases.

Brandon Garrett sought to find out. He took a sample of 200 cases in which wrongly convicted persons were later exonerated by DNA evidence and released from prison. He paired each of those cases with a non-exoneration case from the same jurisdiction, near in time, charging the same crime. Whereas the first group consisted entirely of innocent defendants, the second consisted of all, or nearly all, guilty defendants. Garrett traced each pair of cases from the initial filing of charges through post-conviction relief (not counting the final DNA-based motions), looking for differences in how courts reacted to the two sets. No differences emerged. The legal process failed to perceive any difference between guilty defendants and innocent defendants who were being wrongly charged, prosecuted, convicted, and denied relief.

How can that be? That eyewitnesses make mistakes relatively often is old news. But wouldn’t forensic science evidence help separate the guilty from the innocent? It turns out that, after eyewitness errors, nothing creates more wrongful convictions than forensic science errors.

Forensic Science Errors and Erroneous Convictions

That post-conviction DNA testing has exonerated hundreds of innocent persons who had been erroneously convicted is well known. Less well known is what researchers found when they dug into the original cases looking for what in the seemingly inculpatory evidence had led to the erroneous convictions. Forensic science errors (“identifying” as the source of crime scene evidence someone who in fact was not) were implicated in more of the cases than any cause other than eyewitness error.

Seeing What One Wishes to See

Much of the information that forensic examiners are called on to scrutinize consists of images that are far from clear. Ambiguous images are an invitation for context (or observer) effects to push judgments toward what the observer expects or desires to see. This phenomenon has been tested in a wide range of settings over many years, and forensic examiners are no less vulnerable. In one well-known study, fingerprint experts were shown a latent (questioned) print and a file (known) print that they themselves had declared an identification in their normal casework several years earlier. This time, however, each examiner was led to believe that the pair of prints was from the FBI’s infamous error in a terrorist bombing case. Each was nonetheless told to put that information aside and use his or her own expertise to decide whether the prints matched. Four of the five experts changed their opinions on this second occasion: three from identification to exclusion, one from identification to inconclusive. Only one offered the same opinion he had reached several years earlier.

Duty versus Inertia

Courts have a duty to screen expert evidence offerings to prevent misleading evidence from reaching juries. According to Daubert, “The overarching subject [of Rule 702] is the scientific validity and thus the evidentiary relevance and reliability [of] a proposed submission.”

The Frye general acceptance test is an alternative filter intended to accomplish the same result, but it does so by relying on the fields themselves to inform the court about what is valid and what is not. As noted above, how a court employs the Frye test (narrowly, considering only what a field thinks of itself, or broadly, taking into account what other relevant fields have to say on the subject) can pre-ordain the outcome when the asserted expertise being evaluated is weak, unripe, or pseudo-science. When the question is the fundamental validity of a field’s major premises and techniques, asking the field for an assessment is pointless. If astrologers were asked whether astrology is valid, do we have any doubt what they would say?

Screening expert evidence has been a problem for centuries. It has not yet been solved, and no easy solutions are in sight. Evaluating the knowledge and techniques of a field one is not intimately familiar with is a daunting challenge. Trying to solve the problem by changing the admissibility rule (from the one that prevailed before Frye, to Frye, to Daubert) has never been a panacea. Furthermore, some judges are unable to do the job well, and others do not want to do it at all and find ways to avoid the effort, notwithstanding the law’s requirement that it be done.

Help is on the way. The American Association for the Advancement of Science has begun a “gap analysis” of a dozen different forensic sciences. What needs to be known to reach claimed conclusions is being compared to the knowledge base on which those claims rest. Any revealed gap will provide agendas for research, which will narrow the distance between assertions and actuality or establish which claims are unsupportable. But the process may take years. What are courts to do now?

What Can Judges Do in the Interim?

Some judges might wish to try harder to enforce the law on the books, admitting expert testimony if and only if it passes the test to which the law subjects it.

But recent history suggests that many judges will want to keep admitting such testimony with little or no scrutiny. These judges might nevertheless wish to impose limits on the admitted expert testimony in an effort to prevent the most misleading testimony from reaching jurors. We offer a number of suggestions in the service of that compromise.

Partial Admission of Testimony

Some courts have barred forensic examiners’ testimony on the ultimate issue of whether the defendant is the source of crime scene evidence, while allowing the expert to testify concerning observable aspects of the evidence. The examiner may describe those observable features because the expert is in a better position to observe them than a juror is. If, however, a field has not established its ability to make a dependable identification, the examiner is barred from offering an opinion on that issue. For example, a microscopic hair comparison expert would be allowed to testify to the features of the questioned and known hair samples but would not be allowed to offer an opinion on whether the hairs share a common source.

Require That Examinations Use Blind Testing and Evidence Lineups

Blind testing is a common feature of many scientific fields, as well as everyday life (e.g., blind taste tests, blind grading). The purpose of blind tests is to remove inadvertent or deliberate bias. Courts could help protect the information presented by forensic examiners from bias by barring results unless the underlying examinations were conducted blind. Because examiners would not know which results investigators expect or desire, their reports and testimony would be untainted by those expectations.

For blind tests to truly work, evidence lineups or similar procedures must also be employed. Eyewitness lineups are accepted as superior to eyewitness show-ups: when only a single suspect is presented (as in a show-up), it is obvious who the suspect is and that investigators expect or desire that person to be identified as the perpetrator. Likewise, evidence lineups used in conjunction with blind testing would be superior to evidence show-ups.
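To make the pairing of blinding and evidence lineups concrete, here is a minimal, purely hypothetical sketch in Python of how a blinded lineup might be assembled. Every name in it is invented for illustration; it describes no laboratory’s actual procedure or software. The essential design choice is the separation of roles: the administrator who knows which item came from the suspect does not communicate with the examiner until the examiner’s report is complete.

```python
# Hypothetical sketch only: assembling a blinded "evidence lineup" so the
# examiner cannot tell which known sample comes from the suspect.
import random
import uuid

def build_blind_lineup(suspect_sample, filler_samples, seed=None):
    """Return (blinded_items, answer_key).

    blinded_items -- list of (code, sample) pairs in random order; this is
                     all the examiner sees.
    answer_key    -- dict mapping each code to its true source ("suspect" or
                     "filler"), held by an administrator who has no contact
                     with the examiner until the report is written.
    """
    rng = random.Random(seed)
    labeled = [("suspect", suspect_sample)] + [("filler", s) for s in filler_samples]
    rng.shuffle(labeled)  # random order hides which item investigators care about

    blinded_items, answer_key = [], {}
    for source, sample in labeled:
        code = uuid.uuid4().hex[:8]          # opaque label reveals nothing
        blinded_items.append((code, sample))
        answer_key[code] = source
    return blinded_items, answer_key

# Hypothetical use: the examiner compares the crime scene sample to each coded
# item and records a conclusion per code; only then is the key consulted.
items, key = build_blind_lineup("hair_from_suspect",
                                ["filler_hair_1", "filler_hair_2", "filler_hair_3"])
```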

Require Accreditation of Labs and Certification of Examiners

Certification is one way that a field can try to enforce standards. For forensic sciences, the American Society of Crime Laboratory Directors/Laboratory Accreditation Board (ASCLD/LAB) inspects and accredits crime laboratories, which at least partially ensures more competent work. For individual examiners, certification programs of various types and quality exist. Neither accreditation nor certification ensures that every examiner in every case follows proper procedures and reaches accurate conclusions, but if courts required them, the quality of the work could be expected to improve. Certification and accreditation programs can also require examiners to pass regular proficiency tests and labs to submit to routine scientific audits.

Require Experts to Testify Within the Bounds of What Is Actually Known

Though small improvements have been made in the wake of the NAS report, examiners in a number of forensic sciences, especially those engaged in pattern matching, regularly testify beyond what the science behind their fields can support. One way courts could attempt to prevent misleading testimony would be to determine the limits of knowledge in a field and require examiners to testify within those limits.

Recently, the FBI and U.S. Department of Justice formally acknowledged that for over two decades, in nearly every case where microscopic hair comparison evidence was offered, examiners gave scientifically unsound, exaggerated expert testimony. “The review confirmed that FBI experts systematically testified to the near-certainty of ‘matches’ of crime-scene hairs to defendant . . .” though no research supports such claims.

Given that one authoritative report after another makes clear that no basis exists for claiming unique individualization, courts could accomplish quite a bit by simply prohibiting testimony that asserts or implies that crime scene evidence has been linked to one possible source to the exclusion of all other possible sources. As fields gradually develop the ability to say more, they can gradually be permitted to testify to more—though it is nearly certain that the data-based testimony of the future will be more modest than what courts have been accustomed to hearing over the past century.

Provide Jury Instructions About the Field’s Limitations

Some courts have decided to admit evidence from problematic forensic sciences but to offset misleading influences by instructing the jury so as to put the evidence in proper perspective. For example, one court admitted handwriting identification evidence despite having concluded that the field had no basis for its central claims. The court explained to the jury that handwriting examiners are not scientists, but are more akin to craftsmen, and their testimony is less precise and less certain than the expert might assert or imply. The instructions were meant to help the jury gauge the weight of the expert’s testimony.

Use Court-Appointed Experts

Another option is the use of court-appointed experts and panels of experts to help courts decide whether and how much testimony to allow. All courts have authority to appoint experts or panels of experts through either court rules or their common-law powers, and some courts also have the power to appoint advisory juries. Such appointees could help courts understand the myriad forms of forensic science and whatever actual science supports them, and thereby aid courts in deciding whether and which forensic science evidence to admit.

Facilitate Counter-Testimony

For the adversarial process to work, advocates on both sides of an issue need to be able to present their strongest case. Yet in criminal cases, typically only the prosecution has access to forensic examiners as expert witnesses. Criminal defendants rarely have experts to scrutinize the work done or the conclusions reached, to challenge the admissibility of the forensic evidence, or to testify at trial concerning the weight that should be given to the forensic evidence. Courts could help remedy this imbalance.

Conclusion

For the better part of a century, courts have been credulous observers of most types of forensic not-really-science, which was offered almost exclusively by one side and which the opposing side had no practical means to challenge.

Perhaps no one should be surprised that such conditions produced what some have termed a “culture of exaggeration”—proponents making claims that far exceed the knowledge base on which their claims stand. The knowledge base remained stagnant because of what the NAS report described as a “paucity of research” combined with a severely limited research infrastructure. Thanks to the NAS report, the federal government has begun the long-overdue process of turning forensic science into actual science.

The challenge courts face is to figure out what they can do to better manage the forensic not-yet-science they are offered in the meantime.