Litigation Journal

Fall 2020: Standards

Proof at the Salem Witch Trials

Leonard M. Niehoff


  • “What everybody knew” in Salem made it effectively impossible for defendants to refute the charges against them.
  • We should consider how “what everybody knows” continues to shape our law and our trials.
  • In the years to come, some of our presently unflagging convictions will doubtless be viewed as embarrassing nonsense.
  • Numerous examples of this exist with respect to scientific evidence.

As of the writing of this article, President Donald Trump’s tweets have included roughly 400 references to “witch hunts.” In a sense, this is unsurprising. The Salem witch trials have a special place in our national identity and vocabulary. Most Americans understand the reference, even if they know few of the historical details. And the phrase “witch hunt” serves as a useful shorthand for any frenzied chase after something that does not exist. The Salem trials also inspire a peculiar fascination: Perhaps no other site of deadly mass hysteria has become a major tourist destination.

Still, most practicing litigators probably know very little about the Salem witch trials. That’s a shame because the Salem proceedings have a lot to teach us. They offer countless insights into the significance of a stable and impartial judiciary, the indispensable place of legal counsel, the critical role of procedure, and—most importantly for purposes of this article—how the concept of proof can go terribly wrong. As tends to hold true with Salem’s lessons, these are mostly cautionary tales.

Historical Context

Permit me to start with some historical context, because that does a great deal of work here. Americans often indulge in a kind of “Salem exceptionalism,” treating the events of 1692 as if they were an isolated and idiosyncratic departure from the long arc of human affairs. Countless books have tried to explain why the Salem witch craze happened, as if it were an aberration, pointing toward everything from group psychosis to frontier stress to hallucinogenic yeast.

In fact, witch hunts and trials had been going on in Europe for hundreds of years before 1692, with consequences of vastly greater magnitude. Nineteen accused witches were executed in Salem. The witch hunt in Germany sometimes saw more than 100 accused witches executed in a single day. And Salem pales in comparison with the mass prosecutions and executions that occurred in France and Scotland.

Common-law countries value precedent, and Salem had an abundance of it—not just across Europe but across centuries. Granted, by 1692, the craze had begun to fizzle out on the other side of the Atlantic, but it had by no means ended. The folklorist George Lyman Kittredge found nothing even remotely strange in the trials that happened in Salem. To the contrary, he declared, it is “inconceivable that the Colony should have passed through its first century without some special outbreak of prosecution—inconceivable, that is to say, to one who knows what went on in England and the rest of Europe during that time.” George Lyman Kittredge, Witchcraft in Old and New England 367 (1956).

As generally held true in Europe, the Salem trials hatched out of a period of political upheaval. In 1629, the Crown had issued a charter that, among other things, allowed for the creation of a general court. For reasons we do not need to explore here, England vacated the charter in 1684, so that by 1692 the Village of Salem was effectively operating without any regular form of government. When, in May of 1692, William Phips, the newly appointed governor of the Massachusetts Bay Province, returned from England, he found the fragmented political and judicial systems overwhelmed by accusations of witchcraft and the colony’s jails overflowing with suspects. He needed to do something quickly.

Phips had risen from poverty to become, in turn, a shepherd, a ship’s carpenter, a sea captain, and finally a successful fortune hunter who achieved immense wealth. He had proved himself scrappy and resourceful but, unfortunately, had no background in the law. So, largely borrowing from the English model, he created the Court of Oyer and Terminer—literally meaning “to hear and determine”—to address the dire circumstances he encountered. Phips would later regret this decision and would dissolve the very court he had established.

The judges appointed to preside over the trials were a mixed lot at best. Three of them (Chief Judge William Stoughton, Judge John Richards, and Judge Wait Winthrop) enjoyed close friendships with the clergyman Cotton Mather, one of the prime movers behind the witch hunt, and attended his church. Mather dedicated one of his books to Winthrop, and Richards consulted with Mather about the significance of evidence offered at the trials. One of the judges, Nathaniel Saltonstall, became so disillusioned after the first trial that he left. Judge Samuel Sewall persisted in the work, but years later wrote an impassioned confession of his error.

The law that these judges applied was hardly a masterpiece of clarity and due process. Echoing a biblical passage from the book of Exodus, the Massachusetts law starkly declared that “[i]f any man or woman be a witch (that is hath or consulteth with a familiar spirit) they shall be put to death.” The judges of Oyer and Terminer were directed to apply this law in deciding cases and also, rather mysteriously, to proceed “according to the law and customs of England.”

The Court of Oyer and Terminer held witch trials on four occasions during 1692, with most sittings spanning a period of several days. The court could conduct multiple trials over the course of each convening because the proceedings moved at a dazzlingly fast pace, often lasting little more than an hour. Their brevity is not, however, the only reason that most litigators today would struggle to think of these events as constituting what we call a “trial.”

The proceedings usually began with a plea from the accused, with the expectation that a defendant who claimed innocence would also openly acknowledge the court’s authority to adjudicate the matter—a sort of admission of jurisdiction. This did not always play out as expected, as in the case of the cantankerous Giles Corey, whose wife Martha had also been accused of being a witch. When Corey declined to make such a concession, the court attempted to extract his cooperation through a punishment that entailed placing more and more stones on his body until he relented. The story goes that Corey defiantly called for “more weight,” which his tormentors provided until they finally crushed the life out of him. Arthur Miller’s play The Crucible offers us a grim recounting of the scene.

After the plea, jury selection ensued. It bore some resemblance to our own: The process started with a pool of 48 men, from which 12 were selected. The accused apparently could question the jurors and challenge them for cause. Once the jury was seated, the prosecutor would commence with the introduction of evidence.

Depositions and Hearsay

The evidence admitted at these trials usually followed the accused there from earlier proceedings. At the preliminary stages of the case, evidence often took the form of “depositions,” written statements from purported witnesses. Ironically, judges tended to prefer such statements over live testimony because they thought them more reliable. In an era that did not have available any easy and trustworthy form of creating verbatim transcripts of oral testimony, a written document seemed more dependable and avoided arguments over what the witnesses had said. The depositions admitted during preliminary hearings were commonly readmitted during trial.

Today, we would swiftly reject such evidence as a flagrant violation of the hearsay rule, but in 1692, that doctrine had not yet fully evolved. The prosecution’s use of multiple out-of-court statements during the infamous treason trial of Sir Walter Raleigh in 1603 helped spur the development of the hearsay ban. (Interestingly, 1604 marked the passage of the most draconian of the several English witchcraft statutes; it was a bad time for justice and due process.) But the hearsay doctrine evolved slowly and did not take on something like its modern form until the early 1700s.

The trials at Salem remind us of why we have a hearsay rule and why we need to proceed cautiously in taking steps that might weaken it. In this regard, it is sobering to note that the version of the hearsay doctrine that currently appears in the Federal Rules of Evidence is subject to more than 30 exceptions and exclusions, including a catchall exception with ominous potential that is thankfully unrealized. Still, evidence admitted under our diluted rule is vastly more reliable than the evidence allowed at Salem, which was for several reasons about as rank as hearsay gets.

To begin, the depositions used in these trials were not limited to the out-of-court statements of people who had personal knowledge about the subject matter of their testimony, which would have been hearsay enough. Rather, as Salem archivist and historian Richard Trask observes, they also included “second-hand rumors” and “fits of fancy.” See Richard B. Trask, Legal Procedures Used During the Salem Witch Trials and a Brief History of the Published Versions of the Records, in Records of the Salem Witch-Hunt (Bernard Rosenthal ed., 2009). In reviewing some of these depositions, a reader might even struggle to discern precisely how many layers of hearsay they involved.

Then there are issues of timing and preservation. Even today, we sometimes view statements written outside of court as trustworthy because they were prepared during or shortly after the events in question, before memories had a chance to fade and distortions had a chance to set in. Indeed, Federal Rule of Evidence 803(5) includes an exception that applies to a witness’s earlier written observations—called “past recollection recorded”—for precisely this reason.

As Trask observes, however, a close review of the Salem documents reveals that many of them were not composed at a single point in time. Instead, they were revised and supplemented with additional text as the proceedings unfolded. It therefore appears that these documents were not so much fixed snapshots of a witness’s knowledge as they were evolving narratives that changed with the prosecution’s theory of the case.

Then there is the matter of who prepared these documents. According to Trask, handwriting analysis suggests that Thomas Putnam wrote out many of the depositions of accusers and other witnesses. No one could characterize Putnam as a disinterested and objective scribe. The earliest accusers in Salem included his wife Ann and their 12-year-old daughter. Thomas himself was the complainant in dozens of cases and testified in 17.

We have a hearsay rule because of concerns about the reliability of out-of-court statements, and for the reasons discussed above, the depositions offered in Salem scored an unreliability hat trick. In many cases, they came in an unreliable form, prepared in an unreliable manner, written by an unreliable scribe. Of course, not all were equally suspect. But few, if any, would satisfy the evidentiary standards of today, and many would fail for a host of reasons.

Presenting Evidence

The evidentiary presentation at the Salem trials usually started with a reading of the depositions made by the various witnesses against the accused. Granted, those witnesses might appear in person so they could swear summarily that their statement was true, but this sort of trial-by-endorsed-hearsay offered no greater assurances of reliability. After all, under the procedures of Oyer and Terminer, the accused had no right to cross-examine the people who had signed the statements against them.

Some live witnesses did testify substantively, most importantly the defendant, who enjoyed no privilege against self-incrimination. In this singularly lopsided system, the prosecutor could cross-examine the accused or anyone who came to his or her defense. The prosecutors, particularly the notorious John Hathorne, were for the most part highly skilled and effective at their jobs.

Consider, for example, Hathorne’s cross-examination of Martha Corey. At one point, Hathorne asked her: “Were you to serve the Devil ten years? Tell how many?” American Studies scholar Katherine Howe—herself the descendant of three Salem witches—notes the trap that the question sets: If the witness says yes, then she has conceded a decade-long pact with Satan; if she says no, then the prosecutor will ask how many years she did agree to serve. Perhaps sensing her insoluble dilemma, the witness responded by laughing. See The Penguin Book of Witches 272 n.18 (Katherine Howe ed., 2014).

Or consider the cross-examination of Martha’s husband, Giles, whose horrible fate was described earlier. The prosecutor (probably Hathorne) asked Corey: “What temptations have you had?” Corey proudly responded: “I never had temptations in my life.” Hathorne followed up: “What, have you done it without temptations?” As Katherine Howe points out, with this question Hathorne craftily transformed a claim of innocence (“I’ve never been tempted”) into a stunning confession (“I made a deal with the Devil even without being tempted into doing it”). Id. at 275 n.5.

It appears that no denial could extricate an accused from Hathorne’s cross-examination tricks. At one point in her questioning, alleged witch Bridget Bishop blurted out: “I know nothing of it. I am innocent to a witch. I know not what a witch is.” To which Hathorne calmly replied: “How do you know then that you are not a witch?” Id. at 168.

Character Evidence

Hearsay is not, however, the only category of evidence that we now generally ban but that found a welcoming home at the Salem witch trials. The court also accepted evidence about the bad or suspicious character of the defendant. The evidence came in all forms (reputation, opinion, and allegations of specific acts) and recounted everything from unpleasant personal interactions to vicious rumors. A strong confirmation bias helped move things along: Many of the accused were, for one reason or another, socially marginalized, and they probably became the target of a witchcraft charge precisely because of their outsider status.

This focus on character made a perverse kind of sense. The early English witchcraft acts had primarily concerned themselves with maleficium—the harm that the alleged witch had supposedly done to the victim’s person or property. Those statutes largely treated witchcraft as just another crime and viewed as relatively incidental the question of whether the accused had accomplished it via arson, poison, or a curse. The 1604 statute, however, shifted its attention toward the status of the accused and more plainly treated as criminal the simple act of being a witch.

The colonial statute, quoted above, followed this model. It technically did not require proof that the accused had used witchcraft to hurt anyone physically or to damage their possessions; the crime consisted simply of being a witch who consulted with familiars. Of course, as a practical matter, the proofs usually included some evidence of harm because that is what prompted the initial complaint and got the ball rolling. But the statute that controlled in Salem made character the centerpiece of the case, so evidence of it was highly relevant at trial.

Indeed, it could be argued that, of all the evidence principles that caused trouble at Salem, relevance did the most mischief. To understand why, we need to remember that trials do not occur in a vacuum and that no trial purports to build up a universe of realities from nothing, like an act of divine creation. As the prominent legal scholar Kenneth Culp Davis observed, “[t]he judicial process cannot construct every case from scratch, like Descartes creating a world based on the postulate Cogito, ergo sum.” Fed. R. Evid. 201 advisory committee’s note.

To the contrary, trials take place against a backdrop of factual understandings that are generally shared among the members of the community. In this sense, we conduct trials within the context of “what everybody knows.” This holds true today when, for example, everyone on a jury has certain basic knowledge about things like cars, household appliances, and medical care. And it held true in 1692, when everyone on a jury had a collective elementary understanding of things like how someone became a witch, how witches did their evil work, and how the diabolical creatures could be identified.

Thus, in 1692, the people of Salem Village knew that someone became a witch by entering into a compact with the devil, who often appeared as a darkly dressed man. They knew that witches had at least one mark on their body. The devil might leave one at the time the witch agreed to serve him, or the witch might grow a small nipple to feed her “familiars” (the cats and other creatures who did their bidding), or both.

They knew that witches were often seen in the presence of their familiars. They knew that witches could change shape, could transport themselves through the air, and could appear in spectral form to their targets. They knew that witches used dolls (sometimes called “poppets”) to work their curses. They knew that a witch could not recite the Lord’s Prayer without stumbling. And so on and so on.

By the end of 1692, serious doubts had emerged about the trials, leading ultimately to the dissolution of the Court of Oyer and Terminer. But historians generally agree that this skepticism related to the efficacy of the trials in identifying witches with the certainty appropriate to a capital case. Even after the trials ended, people continued to believe in witches and in the attendant signs, like marks and familiars and apparitions. “What everybody knew” about witches was stubbornly fixed and remained so for some time.

Those beliefs made relevant a wide range of evidence that, with the hindsight of our 21st-century eyes, seems utterly meaningless. Today, we would find it wholly unremarkable that someone would have a mark on her body, or had been seen in the company of a man in dark clothes or a pet, or kept dolls around the house, or struggled to recite the Lord’s Prayer perfectly when her life depended on it (especially if she were illiterate or were not fluent in English, as was true of some of the accused). In the Salem trials, however, all of these facts had a grotesquely outsized significance.

But it gets worse, and in two ways. First, because of the belief that witches could appear in spirit or spectral shape to the cursed, accusers were allowed to testify to their dreams and visions. The use of “spectral evidence” led to a controversy, with Cotton Mather defending it and critic Robert Calef harshly condemning it. Calef made out a withering indictment of the practice, and Mather responded by burning Calef’s book in Harvard Yard.

Second, because of the prevailing demonology of the day, the absence of these facts did not necessarily tend to exonerate the accused. The devil might appear as a dark man, but might also manifest as a small boy or an animal, so testimony that the accused had been seen in the presence of pretty much anyone or anything pointed toward guilt. The lack of a visible mark on the body of the accused might mean that the devil had helped conceal it or that the witch had allowed the nipple to dry up to avoid detection.

A particularly striking example of the difficulty of trying to offer exonerative evidence comes in the case of the Lord’s Prayer. One of the accused witches was, ironically, the former pastor to Salem Village, the Rev. George Burroughs. The prosecution failed to offer many of the conventional proofs against Burroughs—for example, that he had the requisite mark on his body. The jury nevertheless convicted him and sentenced him to hang.

While Burroughs was in the process of being executed, he recited the Lord’s Prayer without hesitation or error. This development gave the crowd that had gathered some pause. But Cotton Mather dismissed their concerns by pointing out that Burroughs had been duly convicted and that the devil had often deceptively appeared as an angel of light. Mather’s argument must have carried the day, because four more executions followed.

In short, “what everybody knew” in Salem made it effectively impossible for defendants to refute the charge, because no set of facts would tend to show their innocence. An accusation thus led ineluctably to a conviction and an execution. Today, we believe that a fair and just trial depends on a falsification principle: With respect to each side’s narrative, there must exist (at least in theory) a narrative that would contradict it. The Court of Oyer and Terminer followed no such rule.

“What Everyone Knows” Today

Before we commence rolling our eyes about the resulting injustices in Salem, we should consider how “what everybody knows” continues to shape our law and our trials. And we should have enough modesty to acknowledge that, in the years to come, some of our presently unflagging convictions will doubtless be viewed as embarrassing nonsense. As Justice Holmes wisely observed in one of his most famous dissents, “time has upset many fighting faiths.” Abrams v. United States, 250 U.S. 616, 630 (1919) (Holmes, J., dissenting).

Numerous examples of this exist with respect to scientific evidence. For instance, over many years (extending to the 1980s), investigators believed that certain facts conclusively indicated that a fire had been started intentionally. These indicia included things like pour patterns in the burn marks on a floor or signs of extremely high temperatures in certain spots. Investigators thought that such evidence signaled the presence of an accelerant and therefore established arson as the cause. This chain of inferences became scientific gospel.

In the 1990s, however, scientists published research challenging these claims. Old and entrenched beliefs resist exorcism, so it took a while for that science to trickle down to courtrooms, prosecutors, and defense lawyers. But by 2004, it had become widely understood that for many, many years, arson investigators had simply misunderstood how fire behaves. It turned out that the factors they had identified were at least as consistent with an accidental blaze as with an intentionally set one.

“Expert” testimony based on that misunderstanding resulted in the incarceration of incalculable numbers of innocent defendants. In a sense, those wrongfully convicted persons were no less victims of mistaken and magical thinking than were the 19 people executed in Salem. Everybody knew something to a moral certainty, and everybody was wrong.

Nor is arson science an isolated phenomenon. Similar reversals have occurred with respect to other principles once taken as highly reliable (such as certain forms of bullet analysis), and debates rage on as to still more (such as evidence of “shaken baby syndrome”). See Caitlin M. Plummer & Imran J. Syed, “Shifted Science” Revisited: Percolation Delays and the Persistence of Wrongful Convictions Based on Outdated Science, 64 Cleveland State L. Rev. 483 (2016). We must never trivialize the tragedies of Salem, but numerically they pale in comparison with these blunders of our own era, whose human toll has been vast.

I conclude with this thought: The people of Salem believed that the devil was at work in their community. It turns out they were right—it just wasn’t the one they were after. This demon took the form of denial of counsel, rank hearsay, character assassination, and an unblinking confidence in “what everybody knows.” The New Testament tells us that when the devil failed to tempt Jesus, he went away—but planned to return at an “opportune time.” Our responsibility, as litigators, prosecutors, defense counsel, and judges, is to prevent that time from being our own.