By the sweat of your browsers you shall eat your bread.
Profits from technological revolutions mostly inure to the benefit of those who first discover the means to produce valuable output from their discoveries. Manufacturers found ways to harness power, machines, and labor to produce saleable goods in the industrial revolution. Radio, television, and telephone companies restricted consumer choices for their own economic benefit. Internet companies have matched both raw and processed data with consumers and businesses willing to pay for such information, in some cases reselling to them the very data they themselves produced.
However, just because a small cadre of capitalists has managed to grab most of the early income from these changes does not mean that society cannot question the fairness of the money distribution or the value to our economy of the rising new order. From Smith to Marx, from Teddy Roosevelt to the telephone trustbusters to U.S. v. Microsoft, important economic thinkers have analyzed how value is created with new technologies and whether certain economic actors are hoarding more than their fair share of rewards. Such analysis of the new information economy is beginning to be spoken aloud and may soon seep into government policy. Senator Elizabeth Warren, on the campaign trail for 2020, just endorsed a regulatory plan aimed at breaking up some of America’s largest tech companies, including Amazon, Google, Apple, and Facebook. Tim Wu, writing in The New York Times, suggests that for Democratic hopefuls in the upcoming presidential election, the problem of monopoly power may be the issue. The Economist recently opined that “if governments don’t want a data economy dominated by a few giants, they will need to act soon.”
We know that data about your behavior has value. An entire information economy enabled by the internet and digitization of transactions, led by Google, Facebook, and Amazon and built on the activities of everyday people, generates billions of dollars each year. Currently, these digital giants are by far the primary beneficiaries of this value, but that is changing, and fast. The individuals generating the data and the governments regulating it are mobilizing to stake their economic claims in the value of data.
The question of valuing data will be crucial in the face of two of the defining political issues of our day: rising income inequality and the oncoming (autonomous) train of artificial intelligence. In fact, to some academics, the very way we find purpose in our lives may well depend on the manner in which we as a society choose to give value to data.
In his recent State of the State address, California Governor Gavin Newsom endorsed an idea that has been circulating in the academic and editorial press: should we treat consumer data as the fruits of “labor” or as a public resource worthy of taxation? Governor Newsom suggested that California would look beyond data privacy regulation toward the creation of a “data dividend,” funded by taxes, to compensate the producers of the data upon which our Internet-of-Things economy is increasingly based. Common Sense Media, which helped pass the California Consumer Privacy Act (CCPA) last year, plans to propose legislation in California to create such a dividend. The proposal has already proven popular with the public; one recent poll showed 45 percent of California voters support the idea, whereas only 28 percent are opposed.
Government entities both here and abroad are looking at ways to tap into data as a source of revenue, as Quebec has done by subjecting digital content, cloud-computing services, and digital content platforms to sales taxes.
However, what are “data”? To pose the question in economic terms, are data “capital” or “labor”? The “data as capital” (DaC) school considers data “the natural exhaust from consumption,” free for any capitalist with the means to exploit and profit from it. Contrariwise, the “data as labor” (DaL) crowd sees data as the possessions of their creators, who should properly be compensated for producing them. Of course, just lumping data into these two historically convenient buckets only begins the inquiry.
Data as Labor
The idea that work gives purpose and dignity to our lives is central to our being. As Nelson Mandela said, “Let there be work, bread, water, and salt for all,” notably putting “work” before “bread.” In the age of machine learning (ML), can we find “digital dignity” as data miners for Google? Are the owners of digital platforms, the factories of our day, ready to pay miners for better data to feed ML and AI?
The blueprints behind this kind of social engineering were written long ago. They are based on the idea that the creation of data is “work” by those laboring in the vineyards of social media and e-commerce, and those workers should be compensated for their labor. As Eric Posner and E. Glen Weyl have proposed, “by treating Data as Labor (DaL) not only can we build a fairer and more equal society, but we can also spur the development of technology and economic growth.”
This kind of reasoning harkens back to Karl Marx’s theory of alienation. Even in 1844, Marx observed a modern, technologically developed world apparently beyond the full control of the masses. (If Marx were alive today, as Randy Newman trenchantly observed, Ol’ Karl might well have been glad he was dead!)
Assessing Marx’s concept of “estranged labor” in his Economic and Philosophical Manuscripts of 1844, David L. Prychitko wrote:
People are required to work for capitalists who have full control over the means of production and maintain power in the workplace. Work, [Marx] said, becomes degrading, monotonous, and suitable for machines rather than for free, creative people. In the end, people themselves become objects—robotlike mechanisms that have lost touch with human nature, that make decisions based on cold profit-and-loss considerations, with little concern for human worth and need.
Marx could hardly have been more prescient in his assessment of the effect of technology on the nature of work. After all, we now live in a world where your car drives itself, your opinion is shaped by malign Russian chatbots, and your boss is an AI. Little wonder then that we see efforts to put “free creative people” back in the driver’s seat.
In Europe, the impactful General Data Protection Regulation is based on 30 years of treatment of privacy as a basic human right, with the core assumption that data created by a person “belongs” to that person and can only be exploited for profit if the data subject consents. Canadian law bases its privacy interpretation on a similar reading of rights to data. This includes a human “right to be forgotten” in which economically valuable data must be deleted by a business at the behest of the original data subject. California already enforces a law based on this core set of rights: the California Eraser Law, effective January 1, 2015. This core assignment of value to the role of the data subject is the same rights-based thinking that animates the DaL debate.
Data as Capital
DaC theorists see data more as a natural resource or as raw materials and look toward programs like the Alaska Permanent Fund, which channels a percentage of all oil royalties into a general fund to be distributed to all Alaskan citizens. Alternatively, to use the internal combustion simile, data are “exhaust” we all create from running our search engines. Using that analogy is, in fact, a good way to think about the issue. Exhaust contains myriad elements, some valuable, some noxious. Some drivers produce exhaust solely as a result of being consumers; some produce exhaust in the course of productive activities for someone else. If you drive for Uber or Lyft, you arguably are doing a little of both.
Let’s suppose (and someone may already have perfected this) that someone creates a “smart scrubber” that not only removes pollutants from vehicle exhaust, but also recaptures commercially valuable elements or compounds that could then be sold. The analogy is almost exact. Is that exhaust “free” for the exploiting? Should drivers be compensated for creating the raw material, even if the consumption of fossil fuels is essentially mindless? Should the companies that maintain fleets of vehicles get the spoils? Should the government, which built the highways, get a slice? How about the energy companies that drilled and refined the fuels? The pipeline companies that transported it to the convenience stores where you filled your tank? The auto makers that allowed for the creation of the exhaust in the first place? The reach could be extended almost indefinitely.
Now apply those thoughts to a financial transaction. You use your search engine to buy something on Amazon, creating your little puff of data “exhaust.” Amazon runs that exhaust through its “smart scrubber” and picks out the valuable data elements, which it then sells. You think it’s just a bilateral barter transaction—free data for free, two-day shipping—if you think about it at all. However, you have now created something that a huge number of people think they now own. In a simple mobile purchase transaction, that throng could include you, your merchant, your credit-card company, an ISO, one or more processors, your wireless provider, your phone handset provider, your loyalty program provider, banks and delivery drivers, and even state and federal agencies. Now what if you are buying that item for work and getting reimbursed? Should that data now belong to your employer, like frequent flier miles earned on flights for work?
Each of these digital “sooners” protects its stake from other “claim-jumpers.” In decades past, when you walked into a store and bought a pair of socks, the store likely kept the data about you and shielded it from competitors. (Retailers have known for many years how valuable transaction data can be for them. Merchants today fear disintermediation from their own customers more than almost all other competitive challenges. SKU-level data are the crown jewels.) You would soon appear on the store’s marketing lists and be offered perks for your loyalty in the form of BOGOs and cents off your gasoline fill-ups in loyalty programs. Thus, even if they are not actively selling those data, or derivatives like advertising, to third parties, companies treat transactional data as an asset—an asset that belongs to them.
Those who conceived and built the “scrubbers” believe there are problems with treating data as a renewable natural resource. Antonio García Martínez, a former product manager for Facebook, the CEO-founder of AdGrok, and a former quantitative analyst for Goldman Sachs, has opined that, “[t]he real value of data to a company like Facebook or Google is how it helps lure you to one of their services and keep you coming back.”
Martínez posits that, unlike a natural resource such as an oil reserve, the value of a dataset is found in its combination with other data elements, not in the dataset itself. He observes that the data used in creating targeted advertising isn’t even data that a collector (like Facebook) actually has; value lies in the combination of those data with other data that live offline. To Martínez, the proper metaphor isn’t oil, it’s TNT. If it’s the combination of data that has the real value, and that value is being created by the “work” of data collectors, not the data producers, why should the consumer necessarily profit beyond the basic barter transaction: “free” content and services the data collector provides in exchange for personal data?
This distinction seems both specious and self-serving in that it ignores the fact that, although refiners indubitably add value in the production of fuels and other petroleum-based products, that does not mean that the raw materials do not themselves have their own significant intrinsic value. The real distinction, if there is one to be drawn, between an oil reserve and a database is that no one living today can plausibly claim a hand in having produced that oil, whereas we can all say we contributed to any number of databases.
Some scholars have argued, pace Martínez, that internet users are both consumers and producers: “prosumers.” Christian Fuchs, chair professor in Media and Communication Studies at Uppsala University’s Department of Informatics and Media, in a paper on “Google Capitalism,” observed that these prosumers produce a commodity through their user activity and “engage in permanent creative activity, communication, community building and content production.” In “Means of Communication as Means of Production” Revisited, William Henning James Hebblewhite discussed the relationship between these prosumers and the platforms with which they interact:
As a means of production, the Internet, or in particular, web-based companies such as Google, Facebook and YouTube are able to take the raw material of information that is provided to them by the user and use that information to create new products, whether that be new online games designed to have the user invest time and money or simply a new addition to their integral system which gets such companies more users.
This new definition seems both unnecessary and not particularly helpful in determining how data should be valued. It does not matter so much what hat one is wearing when one creates data; what matters is whether one should receive the wherewithal to help keep one’s head warm as a result of that activity and if so, how much and from whom?
In short, the data underlying this new economy comprise the timely description of the activities, priorities, and preferences of real people, and the economic value from these data is derived by using this prosumer information to drive a particular prosumer, or like-minded groups, to make future economic decisions. To retailers, those decisions are potential sales. To subscription services, those decisions demonstrate the value of the service to the prosumers.
The old marriage adage goes, “Why buy the cow if you can get the milk for free?” Online services have spent enormous sums of capital based on what is essentially free milk: rustling up the docile free-range cows, building the #pens and #milkingsheds, milking them for transactions and preferences, turning the milk into Greek yogurt, artisanal cheeses, and whey protein, and selling these products to willing buyers. (In Amazon’s case, it has thrived by selling the milk back to the original producers, as well as others.) Few would argue that these companies do not deserve to be compensated for all that cost and effort (and for providing something of value to keep the cows coming back to the trough), but do they deserve all of the compensation?
Taxation or Compensation?
As income inequality grows, as more workers become redundant (or, as at least one social scientist has put it, irrelevant) as a result of AI and ML, and as socialism ceases to be the economic theory that dare not speak its name, politicians and regulators are looking toward the value of data as a way out. If the Bezosians and Zucker-burghers of the world control the means of production of social media and e-commerce, should we help address income inequality by taxing them and redistributing their corporate profits to the alienated “prosumers,” or should we simply treat data as if it were a natural resource or raw material exploited by a few large corporations that must pay for the privilege?
Those who have proposed these schemes are long on ideas and short on methodologies or practical solutions. Is there a technical method of accounting for the depth of a user’s internet activity and allocating funds accordingly? Should every consumer get the same dividend irrespective of his or her contribution to the digital economy? Is every entity that uses data subject to taxation? Are we ready for labor unions representing data subjects bargaining collectively with the beloved/despised forces of Facebook, Google, and Amazon? Would data subjects be willing to forgo their Prime orders, Google Maps directions, and Facebook “likes” until they got what they wanted? The calls to “Delete Uber” did not reflect an elevated societal consciousness.
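To make the allocation question concrete, here is a minimal sketch of two ways a data dividend might be divided: an equal per-capita split and a split weighted by each user's activity. Everything in it is an assumption made for illustration: the function names, the "activity score" (imagined here as a simple count of logged transactions), and the fund size are hypothetical and do not come from any actual proposal.

```python
from typing import Dict


def equal_dividend(fund: float, users: Dict[str, float]) -> Dict[str, float]:
    """Split the fund evenly, ignoring how much data each user produced."""
    if not users:
        return {}
    share = fund / len(users)
    return {name: share for name in users}


def weighted_dividend(fund: float, users: Dict[str, float]) -> Dict[str, float]:
    """Split the fund in proportion to each user's hypothetical activity score."""
    total = sum(users.values())
    if total == 0:
        # No measurable activity at all: fall back to an even split.
        return equal_dividend(fund, users)
    return {name: fund * score / total for name, score in users.items()}


# Hypothetical activity scores (e.g., counts of logged transactions).
activity = {"alice": 10.0, "bob": 30.0, "carol": 60.0}
fund = 1000.0

print(equal_dividend(fund, activity))
print(weighted_dividend(fund, activity))
```

Even this toy version exposes the policy choice: the weighted split rewards the heaviest users of the platforms, while the equal split treats data as a common resource, and neither says anything about how an activity score would be measured or audited.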
The recently amended CCPA is a good example of a state legislature trying out a DaL plan. Instead of a generally available data dividend, as Governor Newsom wants studied, the CCPA contains what appears to be a vigorous nod toward bilateral compensation arrangements between consumers and data collectors. Businesses are encouraged to offer a “data royalty” or “information incentive” to their customers under the CCPA in that it explicitly sanctions such “pay to play” arrangements. Section 1798.125(b) of the CCPA provides:
(1) A business may offer financial incentives, including payments to consumers as compensation, for the collection of personal information, the sale of personal information, or the deletion of personal information. A business may also offer a different price, rate, level, or quality of goods or services to the consumer if that price or difference is directly related to the value provided to the consumer by the consumer’s data.
(2) A business that offers any financial incentives pursuant to subdivision (a), shall notify consumers of the financial incentives pursuant to Section 1798.135.
(3) A business may enter a consumer into a financial incentive program only if the consumer gives the business prior opt-in consent pursuant to Section 1798.135 which clearly describes the material terms of the financial incentive program, and which may be revoked by the consumer at any time.
(4) A business shall not use financial incentive practices that are unjust, unreasonable, coercive, or usurious in nature.
Skepticism about DaL schemes like the California data dividend already is in the air. Throwing shade on the governor’s plan to study data dividend schemes, Owen Thomas recently wrote in The San Francisco Chronicle:
It will take months to report back what should be obvious to anyone who has an inkling of how online data juggernauts operate: If you want Facebook and Google to pay more to ameliorate the social ills they cause, just raise their taxes.
Thomas’ view is illustrative of the way we tend to think about carbon taxes and “cap-and-trade” plans: as retribution or compensation for damage caused by commercial activity. Those who profit from commercial activities that create pollution as a by-product of their use of natural resources should compensate society for the harm to the environment they cause in the process. This thinking seems to animate the latest French swipe at large, U.S.-based data companies: an enormous data tax bill. The bill would apply to digital companies like Google, Amazon, Facebook, and Apple, with worldwide revenues over 750 million euros ($848 million), including French revenue over 25 million euros. Justifying the new tax, French Finance Minister Bruno Le Maire clearly drew the battle lines: “This is about justice . . . . These digital giants use our personal data, make huge profits out of these data . . . then transfer the money somewhere else without paying their fair share of taxes.”
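The two revenue thresholds described above amount to a simple scope test, sketched below. This is an illustration only: the function and constant names are hypothetical, the figures are the ones quoted in this article, and reading "over" as a strict inequality is an assumption. The bill's actual definitions of covered revenue are not reproduced here.

```python
# Thresholds as described in this article (euros); names are hypothetical.
WORLDWIDE_THRESHOLD_EUR = 750_000_000
FRENCH_THRESHOLD_EUR = 25_000_000


def in_scope(worldwide_revenue_eur: float, french_revenue_eur: float) -> bool:
    """Return True only if both revenue figures exceed their thresholds.

    Assumes "over" means strictly greater than; the bill itself may differ.
    """
    return (
        worldwide_revenue_eur > WORLDWIDE_THRESHOLD_EUR
        and french_revenue_eur > FRENCH_THRESHOLD_EUR
    )


# Hypothetical companies, not real financial data.
print(in_scope(2_000_000_000, 40_000_000))  # large globally and in France
print(in_scope(2_000_000_000, 10_000_000))  # large globally, small in France
print(in_scope(500_000_000, 40_000_000))    # below the worldwide threshold
```

Because both conditions must hold, a company with a large worldwide business but little French revenue would fall outside the test.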
Viewed differently, however, we could easily think of such taxes as payment for the use of raw materials (that theoretically belong to us all) to create something that benefits society. Reframed in this way and applied to data, a new way of thinking about the value of data emerges.
If data taxes are anathema to some, it may help to recast such imposts as rents or license fees for the use of a renewable resource we all have a hand (or a mouse) in creating. The aggregate rents on the use of data, or digital exhaust, could be funneled into any number of programs to help citizens continue to find dignity in their lives as the nature of work changes, such as:
- a “superfund” to help compensate those harmed by cyber crimes or to strengthen our nation’s defenses against cyber warfare;
- retraining programs for those workers displaced by ML and AI;
- a new WPA or CCC to fix our broken infrastructure;
- expanding rural internet connectivity; or
- securing the 5G network.
All of these programs are of a piece with the zeitgeist of the Green New Deal. As the digital divide grows, in some manner or another valuing and taxing data could help build the necessary bridges for more of us to cross over to lives of dignity and purpose in the age of data.