April 01, 2014

How Technology Broke Privacy

Privacy may go down in history as one of the great casualties of our unceasing march into the future.

Brian Pascal


Attorneys, and especially litigators, are uniquely vulnerable to the shock that arises when the future hits the present. Consider the one constant in any litigation: citations to precedent. Precedent looks backward, trying to connect the present circumstances to past rulings. But by definition, innovation is made of things that haven’t happened yet. While it may be easier for attorneys to remain within the comfortable boundaries of precedent, it is vital to recognize that at some point empirical reality diverges from history and becomes something qualitatively new. This is especially true in cases that involve technology: Cars are more than fast horses, atomic bombs are more than really big cannons, and computational cryptography is more than a room full of mathematicians. And privacy in the modern world is inextricable from its technological underpinnings.

In a very real sense, privacy may go down in history as one of the great casualties of our unceasing march into the future. Maybe this is just societal growing pains: Privacy is an odd, hybrid thing cobbled together out of some combination of norms, laws, and technology. It varies from place to place, and it changes over time, and perhaps the reason that the privacy of today looks so different from the privacy of a half-century ago is that the world has changed so much in that period of time.

But maybe not. Maybe something else is going on, something that is both much simpler and much more troubling. Within the past couple of decades, we have seen a series of advancements that have fundamentally altered the way in which individuals interact with each other, with businesses, and with their governments. Whatever other effects these changes may have produced, they have inflicted great violence on our conception of privacy, wearing it down and weakening it. And throughout this period of great and tumultuous change, the legal community as a whole (with a few notable exceptions) has clung to its old, well-loved analytical tools and methods of understanding. The net result? In many circumstances, we are left with laws that are internally consistent and possessed of historical inertia but poorly suited to the world in which we now live.

It is far beyond the scope of this article to spin out a complete scientific and legal history that surrounds these innovations, with all the complicated dynamics that arise when innovation collides with an unready society. Similarly, it would be the work of an entire textbook to list all the laws that have ended up on the wrong side of history because they didn’t properly anticipate the future.

Instead, let’s consider three short stories about the effect modern technology has had on privacy, each showing how a relatively small, understandable set of technologies came together in ways that the law failed to foresee or accommodate. As you read them, try not to focus on what went wrong. Rather, think about how little it would take to do better. The right decision at the right time can change everything. Privacy in the 21st century is far from a lost cause.


Our Data Trails

We all shed information constantly, leaving a wake of data that trails behind us everywhere we go. This is just a fact of modern life, with so much of what we do every day mediated by electronic services. Every credit card transaction, every phone call, every email, and every web search routes information through centralized service providers. And every day, more and more of these companies spend more and more money looking to monetize this information, putting to work what they know about us.

This isn’t exactly a new idea. Credit cards have recorded this sort of information for decades, as have grocery stores with loyalty cards and, in the days before Netflix, video stores. We’re buying groceries anyway, and when analyzed in the aggregate, the purchasing habits of a large number of consumers can help a particular grocery store tailor its purchasing to the needs of its consumers. This is an economist’s dream: It’s taking information that otherwise would have been discarded and extracting real, tangible value from it. Even better, this extraction is often transparent to the consumer, who sees the result as lower prices, or better interest rates on credit cards, or some other similar benefit. Computers have magnified this effect, with companies spending huge amounts of time and computational power to derive statistical insights from their consumers’ shopping habits. Occasionally, though, these results can cross the line between effective advertising and downright creepiness. In one well-publicized scenario, Target was able to determine that a teenage woman was pregnant even before she had told her own family. It is reasonable to believe that countless other companies are pursuing these kinds of results with great fervor.

The arrival and subsequent explosion of the Internet expanded this idea to the size of the world. Service providers such as Google and Facebook began to play such a prominent role in the overall operation of the Internet that they found themselves possessed of enormous amounts of information, in the same way that U.S. control of the Panama Canal translated into enormous influence in global shipping. Thanks to the architecture of the network, recording this kind of information went from pretty easy (say, the difficulty of getting a shopper to sign up for a loyalty card) to effectively trivial, and the quality and detail of the information rose as the effort required to collect it fell.

In the early 2000s, this must have seemed akin to something like a farmer discovering the ability to spin straw into gold. Users, for the most part, may not have cared about last week’s search history and web browsing habits, but to companies like Google and Yahoo (and to their advertisers), that information was valuable almost beyond measure. It was worth so much that it enabled these companies to provide services with infrastructure costing billions of dollars to users absolutely free of charge.

When the social media juggernaut began to charge across the world in the mid-2000s, at first it looked like more of the same. In exchange for a detailed map of their social networks, users gained access to a rich suite of sharing tools that simply could not have existed before the Internet, though the revenue model was still mostly opaque to users, decoupled from their day-to-day experience. However, it did not take long for the clever scientists at these Internet firms to realize something profound. For reasons buried deep in human neuropsychology, users seemed willing, even downright eager, to provide these companies with more and more information, everything from favorite books and movies, to favorite restaurants, to even their favorite brands. This inverted the extant understanding of advertising: Instead of applying mass social psychology, surveys, and all the other dark arts of advertising to correlate their ads with viewers’ preferences, the viewers themselves told the advertisers what they wanted to see. Moreover, these companies also discovered that the more they knew about the user in general, the more accurately they could predict that user’s behavior. They may not have known why a preference for a certain kind of bourbon correlated with a preferred brand of running shoe, but advertisers were more than willing to buy the conclusions without caring about the underlying reason.

For all its apparent harmlessness, this realization signaled a sea change in the business of social media companies, advertisers, and, really, almost all of the Internet. Suddenly, no piece of information about a user was too small to be recorded, no data point too insignificant. Firms found themselves searching for new ways to get users to disclose more and more about themselves, from spontaneously altering privacy policies to opting millions of users into services without their permission.

The result has been a shift in the balance of power between users and service providers. For a variety of social, biological, and neurological reasons, humans are not particularly good at assigning value to abstract objects, and the kinds of information that we voluntarily disclose to social networks are abstract in the extreme. Furthermore, service providers have incentives to take advantage of this kind of intrinsic shortsightedness. Their users expect the services to be free, and their shareholders and advertisers expect ever-greater returns on their investments.

This economic structure is at least part of the reason that online privacy appears to be eroding over time. Approached from this perspective, privacy as a commodity is not valuable in this particular marketplace—in fact, it’s anti-productive, as privacy limits the flow of information from users to service providers. And as troubling as this erosion is in the short term, there are hints that it may be even worse in the future.

What, one might ask, does this have to do with litigation? The answer, discussed more fully in Simon Goodfellow’s article elsewhere in this issue, might be that consumers are running out of patience with the largest Internet service providers. Over the past few years, we have witnessed a constant drumbeat of privacy-oriented class action lawsuits against companies like Google, Apple, Facebook, LinkedIn, and Netflix. The complaints have accused these companies of everything from “reading” users’ emails in the course of automated processing (In re Google Inc. Gmail Litigation), to allowing mobile applications to disclose personal information to third parties (In re iPhone/iPad Application Consumer Privacy Litigation, No. 5:11-md-02250-LHK, U.S. Dist. LEXIS 106865 (N.D. Cal. Sept. 20, 2011)), to employing a user’s “likes” improperly as advertisements, calling them “sponsored stories” (Angel Fraley et al. v. Facebook, Inc.).

While the vast majority of these cases end in either settlements or dismissals, they seem at least to stand for the proposition that consumers value their privacy quite highly, perhaps more than the Internet companies originally suspected and perhaps even more than the consumers themselves knew when they first signed up for these services. Moreover, given the relatively limited economic influence consumers can exert over these companies, these lawsuits are one of the few vehicles through which consumers can directly exercise their privacy interests on a noticeable scale.

Police Tracking

In 2004, the Washington, D.C., Metropolitan Police Department, working with the Federal Bureau of Investigation, began investigating nightclub owner Antoine Jones for possible narcotics violations. As part of their investigation, the officers installed a Global Positioning System (GPS) unit on Jones’s car and used it to track his movements without a warrant. In late 2011, the case of United States v. Jones reached the Supreme Court, and in January 2012, the Court held that this warrantless use of a GPS tracker violated Jones’s Fourth Amendment rights. The majority opinion relied primarily on a trespass argument, stating that it was the government’s placement of the device on Jones’s private property that generated the constitutional violation. Sifting through the various concurrences, it is also possible to tally five justices’ worth of support for a more expansive argument that has come to be known as the “mosaic theory,” a phrase first coined by law professor Orin Kerr. Under the mosaic theory, the emphasis would rest on the idea that Jones possessed a reasonable expectation of privacy in his physical location, and, regardless of method, it was improper for the police to acquire that information without a warrant.

Though the mosaic theory is an attractive argument, it was not the basis for the majority opinion (though Justice Sotomayor’s concurrence relied on an effectively identical justification). By declining to explicitly accept the mosaic theory (or a similarly flexible approach), the Court ensured that its holding in Jones would be so narrow as to be nearly obsolete the day it was handed down.

In an ever-increasing number of cities throughout the United States, police departments are installing “automated license plate readers” (ALPR, also called “automatic number plate recognition,” or ANPR, in Europe). ALPR systems are networks of cameras that capture the license plate of every car that drives by, along with a time stamp and geographical location. When the information is combined into a database and filtered properly, it is possible to track the movements of a car throughout a city without employing anything so clumsy as a physical GPS unit. In addition, it is possible to cross-reference the ALPR database with other police data sources. For example, if police receive a 911 call reporting gunshots, they could access the ALPR database and search for any vehicles registered to individuals who were arrested for gun crimes. An unlucky individual passing by a few blocks away might find himself the subject of a police investigation, based solely on the route he chose to drive and his past interactions with law enforcement.
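To make the mechanics concrete, here is a minimal sketch in Python of the two queries just described: reconstructing one car's movements, and cross-referencing sightings against another list. It is an illustration only, not a description of any real ALPR product. The record layout, the function names, the camera labels, and the sample plates are all invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PlateRead:
    plate: str         # license plate as read by the camera
    seen_at: datetime  # time stamp of the capture
    camera: str        # hypothetical label for the camera's location


def movement_trail(reads, plate):
    """Return every sighting of one plate in time order."""
    return sorted((r for r in reads if r.plate == plate), key=lambda r: r.seen_at)


def flagged_plates_near(reads, cameras, start, end, watch_list):
    """Return watch-listed plates seen by the given cameras in a time window."""
    return sorted({
        r.plate
        for r in reads
        if r.camera in cameras and start <= r.seen_at <= end and r.plate in watch_list
    })


# Invented sample data: three cars passing four cameras on one evening.
READS = [
    PlateRead("ABC123", datetime(2014, 4, 1, 21, 5), "5th & Main"),
    PlateRead("XYZ789", datetime(2014, 4, 1, 21, 7), "5th & Main"),
    PlateRead("ABC123", datetime(2014, 4, 1, 21, 20), "Harbor Bridge"),
    PlateRead("JKL456", datetime(2014, 4, 1, 21, 22), "Oak & 9th"),
    PlateRead("ABC123", datetime(2014, 4, 1, 20, 40), "Elm Street"),
]

# One car's evening, reassembled from scattered camera captures.
trail = [r.camera for r in movement_trail(READS, "ABC123")]
print(trail)

# Watch-listed cars near a reported incident, within a half-hour window.
hits = flagged_plates_near(
    READS,
    cameras={"5th & Main", "Oak & 9th"},
    start=datetime(2014, 4, 1, 21, 0),
    end=datetime(2014, 4, 1, 21, 30),
    watch_list={"XYZ789", "QRS000"},
)
print(hits)
```

Neither query needs a suspect in advance. The trail for any plate is already sitting in the data, and the cross-reference flags a driver purely because of where the car happened to be.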

Hypothetical scenarios aside, there are three very real facts about ALPR systems that many privacy advocates find troubling. The first is that ALPR is indiscriminate: By design, it captures the license plate of every car that drives by. This means that information about the movements of every individual driving through an ALPR-enabled city lies buried in police databases, whether or not those individuals are targets of an investigation. And, make no mistake, these data sets can contain substantial details about an individual’s habits. They can reveal the times of day when a mother leaves for work and picks up her children, the fact that an individual visits a mental health professional on Thursday afternoons, or the location of the cheap by-the-hour hotel that two married individuals frequent for late-night trysts.

The second concern that ALPR systems raise is one of scale. Thirty years ago, it was comparatively expensive in terms of manpower and logistics to tail a suspect. If the police wanted to follow an individual’s movements, they had to devote officers to the task, and, if they were wrong, that time and that effort were entirely wasted. Today, the same ALPR system that can monitor a thousand cars can just as easily monitor a million. The scale of this surveillance is subject only to the technical limitations of the ALPR system, and, in practice, this is not much of a limitation at all. Modern systems can track almost every vehicle in a city, in something approaching real time, and they can store data long enough to allow police to review an individual’s movements over the course of prior weeks or months. Legal scholar Woodrow Hartzog has written much about how some level of obscurity is necessary to maintain basic privacy and freedom within a modern society, and ALPR systems push in exactly the opposite direction. See Woodrow Hartzog and Evan Selinger, “Obscurity: A Better Way to Think about Your Data Than ‘Privacy,’” Atlantic, Jan. 17, 2013.

The third, and perhaps incongruous, difficulty with ALPR arises from the fact that these systems are almost certainly legal. It has long been an accepted legal doctrine that individuals have no reasonable expectation of privacy in their photographic image in a public place (see, e.g., Katz v. United States, 389 U.S. 347 (1967)), and capturing the images of cars on public roads is all that ALPR systems do. As discussed above, these photos can paint a picture of an individual’s personal life far more detailed than anything that can be captured by a single telephoto lens. Despite this, much of Fourth Amendment law stubbornly focuses on single acts of capture rather than fully comprehending the power of aggregated information.

For all their ubiquity and “creepiness,” at first glance ALPR systems seem a possible boon to defense attorneys. If these systems capture so many subjects so broadly, then they should be a rich source of reasonable doubt. After all, any halfway-competent defense attorney, upon learning that his or her client was charged on the basis of information derived from an ALPR database, would surely subpoena the contents of that database; disclosure of the database might even be required under Brady v. Maryland. But prosecutors have recognized this possibility as well, and it is exceedingly unlikely that we will ever see ALPR evidence introduced directly in court. This information is used for investigatory purposes, and the case that the prosecutor presents to the jury is built on other information, even if it would have been impossible to uncover these facts without the starting point provided by the ALPR database.

There is nothing inherently wrong with this practice: Bifurcating the investigation from the prosecution and presenting a coherent, clear story to the jury is a tactic as old as adversarial criminal procedure. However, when combined with the massive amount of information that law enforcement is able to gather about an individual from other sources (e.g., from the suspect’s mobile phone, as discussed in Yuri Mikulka’s and Sarah S. Brooks’s piece elsewhere in this issue), ALPR systems and tools like them have the potential to be put to use in a far more sinister way.

This is not mere paranoid speculation: In an exclusive story from August 2013, Reuters uncovered information suggesting that agents of the U.S. Drug Enforcement Administration (DEA), while investigating ordinary crimes unrelated to terrorism, based their investigations on tips received from a top-secret Special Operations Division (SOD) within the DEA. This division was initially created to combat Latin American drug cartels, and it partnered with such agencies as the Federal Bureau of Investigation, the Central Intelligence Agency, and the National Security Agency (NSA). However, because the activities of the SOD were secret, DEA agents were encouraged to use a technique known as “parallel construction” to recreate an investigative trail, excluding the SOD information. See John Shiffman & Kristina Cooke, “U.S. Directs Agents to Cover Up Program Used to Investigate Americans,” Reuters, Aug. 5, 2013.

It is impossible to guess at how widespread this practice is, but it would be naïve to assume that the activities discussed in the Reuters article are the one and only instance in which information collected through the tools of international intelligence and counterterrorism was put to domestic use. Either way, the existence of large databases of information only makes the practice of parallel construction easier than it has ever been before.

It’s worth reiterating here two points I raised previously: We shed data constantly, especially in transactions mediated by electronic communications systems; and information in the aggregate is often far, far more powerful than the sum of its parts. Taken together, these ideas were the detonator that, in the past few years, sparked the explosive popularity of “big data” as a phrase, business model, national security asset, and philosophy.

Big Data Hypothesis

Despite its ubiquity, the term “big data” remains poorly defined. It is employed in an enormous variety of fields and applied to an even greater variety of problems. It encompasses everything from stock trading to predictive policing to counterterrorism, but all who seek to employ its techniques adhere to one crucial article of faith: Given a large enough data set and powerful enough analytical tools, it is possible to extract truths that would not be obvious from the individual data points. I call this the Big Data Hypothesis. There are certainly scenarios in which the Big Data Hypothesis applies—think about the descriptive power of ALPR information compared with that collected by simple red light cameras (in fact, in many places, ALPR is implemented using general-purpose big data tools). That said, as a matter of logic, mathematics, and science, it is far from clear that the Big Data Hypothesis is a universal, axiomatic truth.

As thoroughly modern as this concept may be, the underlying legal theory that is used to justify it is actually quite old. In Smith v. Maryland, 442 U.S. 735 (1979), the Supreme Court held that the use of so-called pen registers (devices that recorded information about phone calls, such as the number dialed and the duration of the call, without recording the calls themselves) did not violate the Fourth Amendment. The idea was that the users of telephone services voluntarily surrendered this call information to the phone companies in the process of completing the call. By the Court’s reasoning, the callers had thereby surrendered their expectation of privacy in that information, which in turn allowed the agents of the state to collect it without needing a warrant.

Thirty-five years later, and aside from Judge Richard Leon’s stayed holding in the recent federal district court case of Klayman v. Obama, Smith and its progeny are still valid precedent, but the world is a very different place. In particular, Smith dealt with a single user of the telephone system, while big data analysis focuses on the patterns that might be derived from the aggregation of many users’ information. Non-content information about communications now has its own special name, “metadata,” and it now encompasses far more than the mere identities of those whom an individual called. Metadata can encompass everything from the cell tower nearest the caller at the time the call was placed, to the route the call took to reach its recipient, and sometimes even the GPS location of a cell phone when it places a call. As before, it is difficult to overstate just how much it is possible to derive from this type of information. A recent study in the scientific journal Nature demonstrated that, subject to the resolution of a cell phone carrier’s antennas, 95 percent of the users on a given cellular network can be uniquely identified from just four data points indicating where their phones checked into the nearest cell tower. See Yves-Alexandre de Montjoye, César A. Hidalgo, Michel Verleysen & Vincent D. Blondel, “Unique in the Crowd: The Privacy Bounds of Human Mobility,” Nature, Mar. 25, 2013.
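The intuition behind that result can be shown with a toy example in Python. This is only a sketch of the general idea, not the method used in the study: the three users, the tower names, and the `matching_users` helper are invented, and each location record is reduced to a hypothetical (tower, hour) pair.

```python
def matching_users(traces, known_points):
    """Return the users whose location trace contains every known point."""
    known = set(known_points)
    return sorted(user for user, points in traces.items() if known <= points)


# Invented traces: each point is (cell tower, hour of day).
TRACES = {
    "user_a": {("T1", 8), ("T2", 9), ("T3", 12), ("T1", 18)},
    "user_b": {("T1", 8), ("T2", 9), ("T4", 12), ("T1", 18)},
    "user_c": {("T1", 8), ("T5", 9), ("T3", 12), ("T6", 18)},
}

one_point = matching_users(TRACES, [("T1", 8)])
two_points = matching_users(TRACES, [("T1", 8), ("T2", 9)])
three_points = matching_users(TRACES, [("T1", 8), ("T2", 9), ("T3", 12)])

print(one_point)     # every user was at tower T1 at 8 a.m.
print(two_points)    # a second point rules out user_c
print(three_points)  # a third point leaves a single candidate
```

Each additional point shrinks the pool of people who could have produced the trace. With only three users the pool collapses after three points; the study's claim is that, on a real network, four are usually enough.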

Well-known technologist, cryptographer, and engineer Bruce Schneier put it this way:

Imagine you hired a detective to eavesdrop on someone. He might plant a bug in their office. He might tap their phone. He might open their mail. The result would be the details of that person’s communications. That’s the “data.”

Now imagine you hired that same detective to surveil that person. The result would be details of what he did: where he went, who he talked to, what he looked at, what he purchased—how he spent his day. That’s all metadata.

Bruce Schneier, “Metadata Equals Surveillance,” Schneier on Security (Sept. 23, 2013).

The power of metadata, especially in light of the Big Data Hypothesis, has not escaped the government and, in particular, the national security community. Given, for example, a suspected terrorist, wouldn’t one want to know the identity of every person he called on his cell phone? And every person his contacts called, and when and where they were when they placed those calls? Perhaps it might be possible, from this information alone, to reconstruct the entire terrorist network and prevent a strike before it happens. What if that terror suspect called his attorney? Most individuals do not call their lawyers just to chat about the weather—the mere existence of the phone call is powerfully informative.

Section 215 of the USA Patriot Act allows the government to acquire business records from service providers, ostensibly for the purposes of fighting terrorism. Edward Snowden’s leaking of NSA documents in the summer of 2013 revealed the true breadth of this project: Based on secret authorizations by the Foreign Intelligence Surveillance Court (FISC), the NSA possessed and routinely exercised the ability to vacuum up enormous amounts of phone call metadata. Among other uses, this information was employed in a process known as “contact chaining,” in which the NSA could analyze not just the calls of a given target but also the records of any person the target called and any person those people called, and so on, up to three “hops” away from the initial target (though in a recent address, President Obama announced that this would be decreased to two steps). Numerically, this process could include the metadata of hundreds of thousands or even millions of users, many of whom were American citizens.
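Contact chaining as described here is, in computer science terms, a breadth-first walk over a graph of who called whom, cut off after a fixed number of hops. The Python sketch below illustrates that idea and nothing more. It does not reflect how any NSA system is actually built. The function names and sample records are invented, and it assumes that a call links both parties regardless of who placed it.

```python
from collections import deque


def contact_chain(call_records, target, max_hops):
    """Map each number reachable from target to its hop distance (1..max_hops)."""
    contacts = {}
    for caller, callee in call_records:
        # Assumption: a call links both parties, whoever placed it.
        contacts.setdefault(caller, set()).add(callee)
        contacts.setdefault(callee, set()).add(caller)

    hops = {target: 0}
    queue = deque([target])
    while queue:
        number = queue.popleft()
        if hops[number] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in contacts.get(number, ()):
            if other not in hops:
                hops[other] = hops[number] + 1
                queue.append(other)
    del hops[target]  # report only the contacts, not the target itself
    return hops


def reach_upper_bound(contacts_per_person, max_hops):
    """Most numbers a chain could touch if nobody's contacts overlapped."""
    return sum(contacts_per_person ** hop for hop in range(1, max_hops + 1))


# Invented call records: (caller, callee) pairs forming a simple chain.
CALLS = [("target", "ann"), ("ann", "bob"), ("bob", "cy"), ("cy", "dee")]

print(contact_chain(CALLS, "target", 3))  # ann, bob, and cy; dee is four hops out
print(contact_chain(CALLS, "target", 2))  # ann and bob only
print(reach_upper_bound(100, 3))          # 1010100
print(reach_upper_bound(100, 2))          # 10100
```

The last two lines show why the hop limit matters. If each person had 100 distinct contacts (a round number chosen purely for illustration), three hops could sweep in just over a million people, while two hops would reach about ten thousand.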

Although the law requires that these efforts be cleared by the FISC, Snowden’s documents have made it abundantly clear just how little oversight power the court actually possessed. Barton Gellman, reporting for the Washington Post, described one notable example: In 2008, the NSA inadvertently intercepted a large number of calls placed from Washington, D.C., when a programmer confused 20, the international dialing code for Egypt, with 202, the area code for Washington, D.C. Despite this collection existing far outside the scope of activities authorized by the FISC, the NSA did not distribute information about this error to its oversight staff, as the issue “only” involved the collection of metadata. General Keith Alexander, director of the NSA, sought to characterize these errors as the kinds of technical glitches that occur in any new system. In a 2009 declaration before the FISC, he even went as far as to say “from a technical standpoint, there was no single person who had a complete technical understanding of the system architecture” used to collect and analyze the bulk metadata information.

To its credit, the FISC did censure the NSA for these behaviors, even going so far as to say that some of its arguments “strained credulity.” But in the end, the courts were limited by their reliance on the NSA’s own disclosures for information about the NSA’s activities. The implications of this structure have dominated the news cycle for months. Story after story has shown how the NSA engaged in behavior that skirted the very edges of the law, engaged in clever word games, and even on occasion allegedly lied to Congress and the courts about its activities. As a practical matter, effective oversight was nearly impossible. Jacob Sommer’s article on page 40 of this issue provides an even more detailed description of some of the institutional failures of the FISC.

While it is impossible to lay the blame solely at the feet of the underlying technology, it did play a substantial role. Technology unavoidably injects complexity into any system, and when the system is as large and unwieldy as the U.S. national security apparatus, that complexity can become quite chaotic before it becomes visible to the larger world. And when the world does take notice, as it has following the Snowden disclosures, it is likely to be stunned by the depth and twistiness of the rabbit hole.

There are many lessons we can draw from these stories, but arguably the most important one is this: When technology interacts with the law, the law will always lag behind. This has been true ever since the creation of the printing press, the first cars, and, on a grander scale, even the atomic bomb. Each of these technologies changed the entire world, and in each case, it took decades for the law to catch up.

When it comes to privacy, though, we may not have decades. The various technologies I have described in this essay, along with countless others, are reshaping our modern conception of privacy on an almost daily basis, and it’s happening with minimal substantive input from the legal community. Privacy, whatever else it may be, is at least partially a legal construct, and it becomes problematic when the theoretical definitions used by attorneys, judges, and lawmakers diverge from the empirical realities created by engineers and scientists.

Trial lawyers, more than any other attorneys, balance on a tightrope held taut between precedent and innovation. Cases are built on present facts but can be decided on the basis of rulings that are decades old. The principle of stare decisis requires that lawyers and courts defer to precedent where possible, but in today’s rapidly evolving world, deciding where to draw the line between what was and what will be is an essential skill for any attorney.

Change is especially prevalent in the world of privacy, where it arises from rapid advances in communications technology, computer science, engineering, and mathematics. It is the job of the attorney to ensure that the law keeps up with reality. For us to play our assigned role in ensuring the future of privacy, we must become better as a profession at understanding the technical operation of the innovations that affect it, whether that means recruiting more lawyers with technical backgrounds, hiring consultants, or simply putting in the time to learn it for ourselves. This may sound like a difficult task, but it’s far from impossible, and the first step is to understand where our specialized knowledge comes up short.

We live in the future now, whether we know it or not, whether we like it or not. Lawyers need to learn how to keep up.

Brian Pascal

The author is a research fellow with the University of California Hastings Law School Institute for Innovation Law and a fellow with Stanford Law School’s Center for Internet and Society.