July 01, 2020 Feature

Investigative Genetic Genealogy and the Future of Genetic Privacy

By Natalie Ram

In April 2018, police arrested Joseph James DeAngelo, alleging that he is the elusive Golden State Killer, who is believed to have committed more than a dozen murders and fifty sexual assaults throughout the 1970s and 1980s. The break in the case came from searching a familiar source, DNA, in an unfamiliar place: a free, online genealogical DNA database called GEDmatch. By comparing a DNA profile extracted from decades-old crime scene evidence to the DNA profiles searchable through GEDmatch, investigators identified a distant cousin of the putative Golden State Killer. Through sleuthing in that extended family tree, investigators eventually homed in on DeAngelo.1

Law enforcement interest in and use of similar techniques have rapidly materialized and continue to grow. Parabon NanoLabs, the first private company to capitalize on law enforcement interest in investigative genetic genealogy, claims that it has already identified more than 100 suspects this way.2 Other for-profit companies have also entered the market, including Verogen, a forensic genetics firm that recently acquired GEDmatch, and Othram, another forensic genetics firm that launched its own consumer genetics platform for law enforcement use. Law enforcement has also gained the cooperation of another significant consumer genetics platform: FamilyTreeDNA acknowledged in January 2019 that it had already been cooperating with the FBI to compare crime scene and consumer DNA profiles. However, the biggest players in the consumer genetics space, 23andMe and Ancestry, have announced that they will resist cooperating with law enforcement.

Investigative genetic genealogy, also known as IGG, raises significant questions and concerns about the relationship between the police and the people, and about the role of the state in mediating that relationship. The benefits of IGG appear clear and substantial. Identifying and prosecuting the perpetrators of serious violent crimes, like the Golden State Killer, is a victory for public safety. Nevertheless, IGG threatens to degrade privacy for all Americans. The loss of privacy may not be a justifiable cost to civil society, much less justified in every instance of criminal investigation to which it could be applied.

Policymakers are only now beginning to grapple with whether, and under what circumstances, IGG ought to be permitted. The Department of Justice has issued an interim policy governing IGG, but that policy is binding only on federal investigators and those receiving federal grants specifically earmarked for IGG purposes.3 A handful of California district attorneys’ offices have also entered into memoranda of understanding regarding this technique.4 Meanwhile, at least two states, Maryland and Utah, have introduced legislation that would regulate or prohibit this new investigative technique.5 However, these responses are piecemeal or not yet enacted, leaving IGG almost entirely unregulated under current law.

IGG arises at the intersection of at least three boundary-pushing investigative and genetic identification techniques. These techniques subvert and invert existing limits placed on police genetic surveillance, though these difficulties may be beyond the capacity of courts to correct. As current regulatory efforts show, state legislatures and attorneys general may be the best-suited actors to bring IGG within moral and legal bounds.

Law Enforcement and Investigative Genetic Genealogy

First, IGG utilizes consumer genetics platforms, like those operated by GEDmatch or FamilyTreeDNA, to generate investigative leads. Law enforcement use of genetic data may be routine, but the Golden State Killer case marked one of the first times police successfully made investigative use of a DNA database not created for law enforcement purposes.

All fifty states and the federal government collect, store, and share genetic information for law enforcement purposes through the Combined DNA Index System (CODIS).6 Federal, state, and local forensic DNA laboratories enter lawfully obtained genetic profiles into the CODIS “offender database.” Crime scene DNA profiles are stored in a separate CODIS index.7 Today, CODIS contains genetic profiles from millions of known felons, misdemeanants, and even arrestees.

Investigative use of consumer genetics platforms, however, differs significantly from CODIS searches in myriad ways, including the scope of data at issue. Jurisdictions participating in CODIS must define precisely which individuals are subject to inclusion in CODIS, and each has done so by statute. While the scope of includable individuals has expanded over time, no state has authorized the collection and search of DNA from ordinary citizens for forensic investigative use. DNA profiles from mere “volunteers” are not permitted to be stored in CODIS for crime detection purposes. Yet consumer genetics databases are populated with genetic data from millions of “volunteers”—persons with no known law enforcement connection.

The intrusiveness of available genetic information resting in consumer genetics databases also far outstrips its statutorily authorized counterpart. A CODIS profile consists of forty data points drawn from twenty highly variable, but noncoding, regions of the human chromosomes. These data points are designed to be maximally informative about individual identity, but minimally informative about anything else. Legislatures and courts have repeatedly described the genetic data used for CODIS profiles as “junk,” and have therefore concluded that its use entails only a minimal invasion of privacy.8 Although the “junk” label is scientifically inaccurate even as applied to CODIS profiles,9 it is even less appropriate with respect to consumer genetics profiles. The latter are “far broader and more information-rich,” consisting of several hundred thousand DNA data points throughout the human chromosomes.10 These data are highly revealing, not only about an individual’s genetic relatives and ancestral origins, but also often about her physical traits, health risks, or other potentially sensitive information.11

Finally, legal frameworks surrounding CODIS include meaningful limitations and protections on the data it contains that are starkly absent from consumer genetics. State legislatures determine whose DNA will be subject to inclusion in CODIS and for what crime detection purposes. Under federal law, laboratories that participate in CODIS must be accredited and comply with federal quality assurance standards.12 In contrast, each consumer genetics platform is a world unto itself. Existing law imposes no accreditation requirements or quality assurance standards on consumer genetics laboratories. Each platform determines how its data will be stored, accessed, and shared, and for what purposes. Many platforms reserve to themselves a unilateral right to change these features at any time and without prior notice.13 Researchers have identified multiple security risks at GEDmatch, including data leakage that could reveal a user’s sensitive individual genetic markers.14

Law enforcement’s use of this new database source raises a host of legal questions. For one thing, such use is arguably inconsistent with established legislative frameworks about whose genetic data ought to be available to law enforcement for routine database search.15 As mentioned, every state has enacted a statutory scheme regulating its official law enforcement DNA database and participation in CODIS, and these statutes might impliedly preempt law enforcement access to other DNA repositories, particularly where such access cannot independently be justified.16

Moreover, law enforcement use of consumer genetics platforms, at least in the absence of a warrant, may well intrude on consumers’ expectations of privacy, and thus run afoul of the Fourth Amendment of the U.S. Constitution. Traditionally, courts held that information that an individual knowingly or voluntarily shares with a third party falls outside the scope of constitutional concern—a rule known as the third-party doctrine.17 However, in the recent case of Carpenter v. United States, the Supreme Court declined to apply the third-party doctrine, holding that individuals retain an expectation of privacy in their cell-site location information, despite the collection of those data by their cell phone providers.18 Furthermore, at least one justice in Carpenter plainly indicated that he would not be inclined to bless warrantless law enforcement access to consumer genetics data. In his “dissenting” opinion, Justice Gorsuch explained that it “strikes most lawyers and judges today—me included—as pretty unlikely” that the government could “secure your DNA from 23andMe without a warrant or probable cause.”19

The Supreme Court’s decision in Maryland v. King provides no sanctuary for use of consumer genetics data. In King, the Court held that Maryland did not violate the Fourth Amendment by including mere arrestees in its official DNA database, searchable through CODIS.20 In reaching this decision, the Court emphasized that law enforcement has a legitimate interest in knowing the identity of individuals in custody.21 IGG, by contrast, is designed to probe DNA profiles of individuals who have no known law enforcement connection, and whom the government has no particular need to specifically identify. King also expressly reserved the question of whether the Fourth Amendment proscribes law enforcement access to genetic analysis related to, “for instance, an arrestee’s predisposition for a particular disease or other hereditary factors not relevant to identity.”22 In fact, that kind of highly sensitive data is an intrinsic part of most consumer genetics profiles. Finally, King emphasized that “statutory protections . . . guard against further invasion of privacy . . . [and] unwarranted disclosures.”23 As set forth above, however, consumer genetics privacy protections are subject to the whims of each individual platform operator—and potentially no legislative or regulatory framework restraining law enforcement at all.

Familial DNA Searches and the Fourth Amendment

Second, these legal questions have proven difficult for courts to adjudicate because they involve familial identification, rather than a search for a direct match with the actual suspect. In every IGG-facilitated arrest announced thus far, the individual ultimately arrested did not place his own genetic data in a consumer genetics database; rather, law enforcement exploited partial matches, indicating that a relative of the putative perpetrator had made his or her DNA available in the database. The use of partial matches indicating familial relationships raises difficult questions, both legal and otherwise. Ordinarily, law enforcement—searching in CODIS—seeks direct matches; that is, that the suspected offender’s DNA profile exactly matches the crime scene profile and therefore the suspect likely committed the crime in question. This is sensible. Individuals identifiable through the offender database are supposed to have some connection to law enforcement already, such as an arrest or conviction. But relatives of an included offender may not have any such connection themselves. Familial searches are most productive when genetic relatives are not otherwise in the known offender database. After all, if the relative’s DNA were included directly, a familial identification would be superfluous.

But a growing number of states have embraced familial searches in their own state offender databases, making the genetic relatives of individuals in the database (who are not themselves otherwise includable) implicitly searchable. Two jurisdictions, Maryland and the District of Columbia, expressly bar such searches by statute. The FBI, for its part, has permitted, but not embraced, the use of familial searching in CODIS.24

Familial searches frustrate ordinary principles of Fourth Amendment analysis. In the context of IGG, proponents have argued that law enforcement use of consumer data is permissible because the data have been voluntarily shared on a consumer genetics platform. But even if that is so for the individual whose DNA is directly catalogued in a database, the same cannot be said of that individual’s genetic relatives. Nearly all genetic ties are thrust upon us, rather than voluntarily undertaken. Children inherit half of their genetic data from each parent, and they are statistically likely to have about half of the same genetic data as their full siblings.25 Less genetic similarity is observed among more distantly related individuals, with shared genetic data decreasing in predictable proportion to the degree of relationship between two individuals. An individual chooses none of these genetic ties. Rather, the genetic patterns that make familial identification possible arise as a product of biology. These are ties that cannot be either controlled or escaped. Familial searches thus subvert the expectations of privacy of individuals who, through no fault of their own, are relatives of people who have been arrested or convicted or have submitted their DNA to a consumer genetics service.

IGG further exacerbates these privacy harms by extending the family tree it makes implicitly searchable. Because of the limited genetic data analyzed for CODIS profiles, familial searches utilizing CODIS profiles generally reach only first-degree relatives, like parents, children, and full genetic siblings. IGG sweeps much more broadly, giving rise to a wide-ranging web of genetic relatives whose DNA investigators may utilize to identify an unknown suspect. According to one study, with access to consumer genetics data from as little as two percent of the U.S. population, as many as 90 percent of Americans of European descent would be identifiable through a third cousin or closer.26 Under these circumstances, it is simply unmanageable for an individual to know of each of his genetic relatives, much less monitor and persuade each to protect his genetic privacy. Indeed, the broad identifiability that IGG makes possible risks giving rise to a de facto universal DNA database for Americans—something that no jurisdiction has indicated would be appropriate.
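The "predictable proportion" in which shared DNA falls off with genealogical distance can be made concrete with a short calculation. The following is a minimal sketch, not drawn from this article or the study it cites: it assumes the textbook rule that each parent-child step halves the expected share of autosomal DNA, and the function and table names are hypothetical. The results are averages only; actual sharing between any two relatives, especially distant ones, varies and can be zero.

```python
from fractions import Fraction

def expected_shared_fraction(steps: int, shared_ancestors: int = 2) -> Fraction:
    """Average fraction of autosomal DNA two relatives are expected to share.

    steps: number of parent-child links on the path connecting the two
           relatives through a common ancestor (assumed simple pedigree).
    shared_ancestors: 2 when the relatives descend from the same ancestral
           couple (full siblings, full cousins); 1 for a direct line such
           as parent and child.
    """
    return shared_ancestors * Fraction(1, 2) ** steps

# Illustrative relationships: (parent-child links, shared ancestors).
RELATIONSHIPS = {
    "parent and child": (1, 1),
    "full siblings": (2, 2),
    "first cousins": (4, 2),
    "second cousins": (6, 2),
    "third cousins": (8, 2),
}

for name, (steps, ancestors) in RELATIONSHIPS.items():
    share = expected_shared_fraction(steps, ancestors)
    print(f"{name}: {share} (about {float(share):.2%})")
```

On these assumptions, a parent and child or two full siblings share about half of their DNA on average, while third cousins share a little under one percent. That steep decline is consistent with the contrast drawn above between CODIS familial searches, which generally reach only first-degree relatives, and IGG, which reaches much more distant kin.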

Yet, Fourth Amendment law makes it difficult for an individual identified through IGG to defend against being identified in this way. Ordinarily, in order to mount a Fourth Amendment challenge, an individual must have his or her own expectation of privacy violated by the challenged search or seizure.27 It is not clear, however, whether an individual’s own expectation of privacy is violated when his or her relative’s genetic data are the subject of an IGG search. The genetic data that were searched were drawn from a relative’s cells, and not the ultimate defendant’s, though those data make both individuals identifiable. The law is not well equipped to grapple with the difficult nature of shared genetic material. Courts should look to other forms of shared property to illuminate analysis about when an individual has a sufficient interest in searched genetic data that do not derive from her own cells.28

The Supreme Court’s decision in Carpenter aids the analysis here as well.29 Carpenter, after all, held that an individual may have an expectation of privacy in data that are deeply revealing about him, even if those data are held and owned by another. So too, courts should recognize that individuals may challenge law enforcement searches of genetic data that implicitly include them, in light of the sensitive data that genetic information contains and the involuntariness of familial genetic associations. However, the only court to squarely address a Fourth Amendment challenge to IGG denied this claim, holding that the defendant lacked standing to challenge the search at all.30

Inverting the Traditional Investigative Model

Finally, a third facet of IGG makes this practice boundary-bending. IGG inverts the traditional investigative model of first identifying a suspect or person of interest and then conducting a search. IGG instead deploys a broad search of unknown persons first in order to identify a connection to a possible suspect whose profile investigators possess. IGG has had its biggest impact in enabling law enforcement to reinvigorate otherwise-cold cases. In the Golden State Killer case, for instance, investigators did not have a particular suspect in mind when they uploaded their crime scene DNA profile to GEDmatch and searched it against all of the consumer profiles already there.31 Law enforcement in other cases has more bluntly acknowledged that suspicion proceeded from a consumer genetics search, rather than preceding it. In discussing an arrest and confession in another case, the prosecutor similarly explained that, but for the link uncovered through an IGG search, “[n]othing else would have drawn [the suspect] to our attention.”32

Although the closure of these long-cold cases is commendable, this facet of IGG sits uncomfortably in the Fourth Amendment context. Ordinarily, the government must have some suspicion that a particular person committed a crime before conducting a search. Suspicionless searches, by contrast, must typically be supported by “special needs,” that is, something more than a mere desire “to detect evidence of ordinary criminal wrongdoing.”33 Indeed, the Supreme Court has implied that law enforcement may not, consistent with the Fourth Amendment, compel even an individual’s name unless an officer already harbors reasonable suspicion that that specific individual may be involved in criminal activity.34 IGG, however, casts a wide net in hopes of generating some lead, any lead. IGG thus conducts its search of genetic data first, and from that search determines whom to suspect of criminal wrongdoing. In so doing, this technique disrupts traditional investigative—and Fourth Amendment—norms.

To be sure, IGG is not unique in inverting the traditional suspicion-then-search approach. Ordinary searches in CODIS are also suspicionless searches in search of a suspect. Yet, as set forth above, the justifications undergirding inclusion in CODIS—prior arrest or conviction, giving rise to diminished expectations of privacy for a defendant and weightier governmental interests in his identity—are absent in IGG.

Many other forms of big data surveillance raise similar concerns about dragnet searching. Here, too, the Supreme Court’s recent Carpenter decision may have purchase. Carpenter did not resolve the question of whether investigative inversions, like those at the heart of IGG, run afoul of the Fourth Amendment. Carpenter itself did not involve such an inversion; law enforcement first identified Mr. Carpenter as a suspect and then pursued his cell phone records without a warrant. The Court’s opinion in Carpenter, meanwhile, carefully declined to “express a view on matters not before” the Court, including “‘tower dumps’ (a download of information on all the devices that connected to a particular cell site during a particular interval).”35

Nonetheless, at least one sitting justice has suggested that Carpenter may well cast doubt on bulk data searches. In his confirmation hearings, Justice Kavanaugh, who had previously authored an opinion that the government’s anti-terrorism bulk telephone metadata collection program did not run afoul of the Fourth Amendment,36 acknowledged that Carpenter is “a game changer” and that he could not have written the same opinion in light of it.37

In (at least) these three ways, investigative genetic genealogy marks a radical shift in who may be subject to genetic surveillance and what types of genetic information may be searched. Investigative searches of consumer genetics data, particularly when combined with familial and dragnet searching, raise serious questions about whether meaningful genetic privacy is possible in the digital age. As both law enforcement appetite for consumer genetic data and the size of consumer genetic platforms continue to grow, the result is likely to become perpetual genetic surveillance for all. Whether courts are willing to make use of new doctrine and well-established norms to recognize boundaries to IGG remains to be seen.

Rather than rely solely on the courts in this arena, we should look to lawmaking bodies that may be able to bring oversight and guidance in developing law to govern the application of IGG by law enforcement and courts. With appropriate procedural and substantive safeguards, perhaps IGG can satisfy advocates and opponents alike. Indeed, robust “statutory protections that guard against further invasion of privacy”38 may well strengthen IGG’s constitutional footing.


1. See Natalie Ram, Incidental Informants: Police Can Use Genealogy Databases to Help Identify Criminal Relatives—But Should They?, 51 Md. Bar J., no. 3, July/Aug. 2018, at 8.

2. Emily Shapiro, She Was Sexually Assaulted and Killed in 1973. Now Genetic Genealogy Identified a Suspect., ABCNews.com (Feb. 28, 2020), https://abcnews.go.com/US/sexually-assaulted-killed-1973-now-genetic-genealogy-identified/story?id=69281238.

3. See U.S. Dep’t of Justice, Interim Policy: Forensic Genetic Genealogical DNA Analysis and Searching (2019), https://www.justice.gov/olp/page/file/1204386/download.

4. See, e.g., Sacramento Cty. Dist. Att’y’s Office, Memorandum of Understanding: Investigative Genetic Genealogy Searching, https://chia187.wildapricot.org/resources/Documents/Sacramento%20County%20District%20Attorney%27s%20Office%20-%20IGG%20MOU%20Example.pdf.

5. See S.B. 848 (Md. 2020); H.B. 30 (Md. 2019); H.B. 231 (Utah 2020).

6. See Maryland v. King, 569 U.S. 435, 444–45 (2013).

7. See 34 U.S.C. § 12592(a).

8. See Natalie Ram, Genetic Privacy After Carpenter, 105 Va. L. Rev. 1357, 1377–78 (2019).

9. See id. at 1379.

10. Id. at 1378.

11. Id. at 1378–80.

12. See 34 U.S.C. § 12592(b).

13. See Jessica L. Roberts & Jim Hawkins, When Health Tech Companies Change Their Terms of Service, 367 Sci. 745 (2020).

14. Antonio Regalado, The DNA Database Used to Find the Golden State Killer Is a National Security Leak Waiting to Happen, MIT Tech. Rev. (Oct. 30, 2019), https://www.technologyreview.com/s/614642/dna-database-gedmatch-golden-state-killer-security-risk-hack/.

15. See David M. Jaros, Preempting the Police, 55 B.C. L. Rev. 1149 (2014).

16. See id. at 1182–87 (making this argument).

17. See Smith v. Maryland, 442 U.S. 735, 744 (1979); United States v. Miller, 425 U.S. 435, 442 (1976). But see Carpenter v. United States, 138 S. Ct. 2206, 2219 (2018).

18. See 138 S. Ct. at 2219, 2220.

19. Id. at 2262 (Gorsuch, J., dissenting).

20. 569 U.S. 435.

21. Id. at 449.

22. Id. at 464–65.

23. Id. at 465.

24. See Combined DNA Index Sys., Bull. No. BT072006, Interim Plan for the Release of Information in the Event of a “Partial Match” at NDIS (2006).

25. See Natalie Ram, Fortuity and Forensic Familial Identification, 63 Stan. L. Rev. 751, 758 (2011).

26. Yaniv Erlich et al., Re-identification of Genomic Data Using Long Range Familial Searches, 362 Sci. 690 (2018).

27. See Byrd v. United States, 138 S. Ct. 1518, 1526 (2018).

28. See Natalie Ram, DNA by the Entirety, 115 Colum. L. Rev. 873 (2015).

29. Natalie Ram, Genetic Genealogy and the Problem of Familial Forensic Identification, in Consumer Genetic Technologies: Ethical and Legal Considerations (I. Glenn Cohen, Nita Farahany, Henry T. Greely & Carmel Shachar eds., Cambridge Univ. Press forthcoming 2020).

30. See State v. Burns, No. FECR129718, at 5–8 (Iowa D. Ct. Feb. 6, 2020).

31. See Justin Jouvenal, To Find Alleged Golden State Killer, Investigators First Found His Great-Great-Great-Grandparents, Wash. Post (Apr. 30, 2018), https://www.washingtonpost.com/local/public-safety/to-find-alleged-golden-state-killer-investigators-first-found-his-great-great-great-grandparents/2018/04/30/3c865fe7-dfcc-4a0e-b6b2-0bec548d501f_story.html (“Criminal DNA databases produced no hits, sweeps of crime scenes no fingerprints and hefty rewards no definitive tips. But Paul Holes, an investigator and DNA expert, had a hunch he could create a road map to the killer through his genetics.”); Eric Levenson, It Started as a Hobby. Now They’re Using DNA to Help Cops Crack Cold Cases, CNN (Mar. 27, 2019), https://www.cnn.com/2018/08/03/health/dna-genealogy-cold-cases-trnd/index.html.

32. Sarah Weinman, The Cold Case Factory, Topic (Mar. 2019), https://www.topic.com/the-cold-case-factory (quoting Allen County prosecutor Karen Richards).

33. Indianapolis v. Edmond, 531 U.S. 32, 38 (2000).

34. See Hiibel v. Sixth Judicial Dist. Court of Nev., Humboldt Cty., 542 U.S. 177, 187 (2004); Libby Copeland, The Lost Family: How DNA Testing Is Upending Who We Are (2020) (“Something as simple as your name can’t be forced from you by police without suspicion.”) (quoting Erin Murphy).

35. Carpenter v. United States, 138 S. Ct. 2206, 2220 (2018).

36. See Klayman v. Obama, 805 F.3d 1148 (D.C. Cir. 2015).

37. Damon Root, Brett Kavanaugh Calls Carpenter v. United States a “Game Changer” on 4th Amendment Law, Reason (Sept. 13, 2018), https://reason.com/2018/09/13/brett-kavanaugh-calls-carpenter-v-united/.

38. Maryland v. King, 569 U.S. 435, 465 (2013).



Natalie Ram is an Associate Professor of Law at the University of Maryland Francis King Carey School of Law. This work is supported in part by a Greenwall Foundation Faculty Scholars grant.