How Much Does AI Really Know?
AI’s ability to create highly accurate profiles of individuals is built on the vast amounts of data it consumes. This includes not just data about what you’ve bought or which websites you’ve visited but also data about your interactions with AI-powered devices. For instance, smart speakers such as Amazon’s Alexa or Google Home listen continuously for their wake words, which means they can capture fragments of conversations and queries, potentially collecting sensitive personal information even when you aren’t deliberately using them.
AI also uses non-obvious data to make inferences about individuals. Machine learning algorithms can predict health outcomes based on data gathered by wearable devices, which can reveal patterns related to a person’s activity level, location, and even sleep habits. Similarly, AI can forecast financial stability by analyzing your spending habits, subscription services, and even your social media posts. The more an AI system learns, the more comprehensive its knowledge becomes, making it possible for AI to predict future actions with unsettling precision.
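To make that inference step concrete, here is a minimal sketch of the idea. It uses entirely synthetic data and hypothetical features (daily steps, sleep hours, resting heart rate) with scikit-learn; it illustrates the general technique, not any vendor’s actual system.

```python
# A minimal sketch of inference from wearable data. All data here is synthetic
# and the features, label, and coefficients are hypothetical; the point is only
# that routine activity signals can reveal a sensitive attribute.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(seed=0)
n = 1_000

# Hypothetical weekly averages a wearable might record per user.
steps = rng.normal(7_000, 2_500, n)        # daily step count
sleep_hours = rng.normal(6.8, 1.1, n)      # hours of sleep per night
resting_hr = rng.normal(68, 9, n)          # resting heart rate
X = np.column_stack([steps, sleep_hours, resting_hr])

# Synthetic "elevated health risk" label loosely tied to those signals,
# standing in for an outcome an insurer or advertiser might want to predict.
risk = -0.0003 * steps - 0.6 * sleep_hours + 0.08 * resting_hr
y = (risk + rng.normal(0, 0.5, n) > np.median(risk)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)

# Even this toy model recovers the sensitive label far better than chance.
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```

Even this toy model recovers a sensitive label from routine activity signals well above chance, which is the core of the privacy concern: the raw data looks innocuous, but the inferences drawn from it are not.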
The implications of this predictive power are far-reaching, as AI systems now hold personal data that can be used to create detailed profiles of individuals. These profiles might contain information that an individual never intended to share, or information the individual does not even know exists. The convenience of AI comes with a loss of control over personal information, and the resulting questions about privacy have yet to be fully addressed.
Who Has Access to Your Data?
One of the most significant privacy risks associated with AI is the question of who has access to the data that powers these systems. The information collected by AI systems is often shared with third parties, such as marketers, advertisers, and other companies that pay for access to user data. In many cases, individuals are unaware of the full extent of the data being collected and shared.
Data brokers, for instance, buy and sell personal information to organizations that use it for targeted advertising or to assess creditworthiness. Governments also have the power to collect personal data through AI systems, often without the consent of individuals. In some countries, government surveillance programs monitor citizens’ online activity, while in others, AI-driven facial recognition tools are used to track movements in public spaces.
This lack of transparency about who has access to personal data is a significant concern. In many cases, individuals are unaware that their data is being used or sold. Even when individuals do have control over their data, opting out of these data practices is often complicated and time-consuming.
Criminals are also targeting this data. In recent years, data breaches have exposed millions of individuals’ personal information, including names, addresses, financial details, and even biometric data. One of the most concerning aspects of AI is that it can increase the value of stolen data: machine learning can combine leaked records into highly detailed profiles of individuals, which criminals can use to commit identity theft or to plan more convincing, targeted attacks.
Case Studies: AI Abuse Across Industries
The ethical and privacy concerns surrounding AI are not hypothetical. There have been numerous real-world cases where AI systems have been used in ways that violate individuals’ rights. Below, we examine three examples from the legal industry, health care, and law enforcement.
Legal Industry: GitHub Copilot and Copyright Infringement
GitHub Copilot, an AI-powered tool developed by GitHub and OpenAI, assists developers by suggesting code as they type, drawing on patterns learned from large volumes of publicly available code. In 2022, developers filed a lawsuit against GitHub and OpenAI, alleging that Copilot had been trained on vast amounts of open-source code without proper attribution or permission from the original authors.
The class-action lawsuit, Doe v. GitHub, Inc., claimed that GitHub’s use of Copilot violated copyright law and open-source licenses by training the AI on licensed code without complying with the licenses’ attribution requirements or informing the original creators. The plaintiffs argued that Copilot’s suggestions were derived from copyrighted code, raising concerns about the erosion of intellectual property rights in an AI-driven world.
This case highlights a critical issue in AI development: the lack of clarity regarding the ownership of data used to train AI systems. While some argue that AI models can be trained on publicly available data under the doctrine of “fair use,” others believe this practice violates the intellectual property rights of creators. The case is still ongoing, but it underscores the growing legal and ethical concerns surrounding AI in industries such as software development.
Health Care: The Risks of Bias in AI Diagnosis
AI systems in health care promise to revolutionize diagnostics, but there have been significant concerns about their fairness and accuracy. A notable example is a 2019 study published in Science that found significant racial bias in a widely used health care algorithm. The system, designed to predict which patients would need extra care and to allocate resources accordingly, used past health care costs as a proxy for health needs.
Because less money has historically been spent on Black patients than on white patients with the same level of illness, the cost proxy made Black patients appear healthier than they actually were, and the system systematically underestimated their needs, exacerbating existing health care disparities. The findings sparked widespread concern over the fairness of AI systems in sensitive sectors such as health care, where biased outcomes can have life-or-death consequences.
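The mechanism behind that finding, training on a proxy label that already encodes unequal treatment, is easy to demonstrate. The sketch below uses purely synthetic data and a deliberately simple model (it is not a reconstruction of the published study): two groups have identical true need, but one group’s historical spending is lower, and a model trained to predict cost under-enrolls that group in a hypothetical care-management program.

```python
# Illustrative sketch of proxy-label bias. Everything here is synthetic and
# hypothetical: "need" is the outcome we actually care about, "cost" is the
# biased proxy the model is trained on.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(seed=1)
n = 20_000

group = rng.integers(0, 2, n)        # 1 marks a group with less access to care
need = rng.gamma(2.0, 1.0, n)        # true health need, identical across groups

# Historical utilization and spending understate need for the under-served group.
utilization = need * np.where(group == 1, 0.7, 1.0) + rng.normal(0, 0.1, n)
cost = 1_000 * utilization + rng.normal(0, 100, n)

# The model never sees group membership, only claims-derived features,
# yet it is trained to predict the biased proxy (cost).
X = utilization.reshape(-1, 1)
model = LinearRegression().fit(X, cost)

# Enroll the top 10% of predicted-cost patients in a care-management program.
predicted = model.predict(X)
enrolled = predicted >= np.quantile(predicted, 0.90)

print(f"group-1 share of population: {group.mean():.2f}")
print(f"group-1 share of enrollees:  {group[enrolled].mean():.2f}")
print(f"avg true need, enrolled group 0: {need[enrolled & (group == 0)].mean():.2f}")
print(f"avg true need, enrolled group 1: {need[enrolled & (group == 1)].mean():.2f}")
```

In the printout, the under-served group makes up far less than its population share of enrollees, and its enrolled members have substantially higher true need, mirroring the pattern the study described: patients in that group had to be sicker to receive the same score.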
Law Enforcement: The Misuse of Facial Recognition Technology
Facial recognition technology has been deployed by law enforcement agencies around the world, raising significant concerns about privacy and civil liberties. In 2018, the American Civil Liberties Union tested Amazon’s facial recognition system, Rekognition, and found that it falsely matched 28 members of Congress with individuals who had been arrested for crimes.
In the test, photos of lawmakers were compared against a database of criminal mugshots, and the false matches disproportionately involved members of color. This led to widespread criticism of facial recognition tools, with many arguing that they contribute to racial profiling and violate individuals’ right to privacy.
In response to growing concerns about the technology’s impact on civil rights, several U.S. cities have banned the use of facial recognition in public spaces, and companies such as Amazon have suspended sales of the technology to law enforcement agencies.
Steps to Safeguard Your Privacy
Given the pervasive risks associated with AI-driven data collection, individuals must take steps to protect their privacy. The following strategies can help reduce your exposure and minimize the risks of AI misuse:
- Understand privacy policies. Start by carefully reviewing the privacy policies of the services and apps you use. Many services allow you to opt out of certain data collection practices, but you need to be proactive in setting your preferences.
- Use privacy-focused tools. Leverage privacy-enhancing tools, including VPNs, encrypted messaging apps such as Signal, and privacy-focused search engines such as DuckDuckGo. These tools can reduce your exposure to data collection by third parties and help anonymize your online activity.
- Opt out of data collection. Many websites and services offer ways to limit or opt out of data collection. Several organizations help consumers remove their data from commercial databases, limit its use in AI training, and opt out of unwanted marketing. Some examples of these websites are DeleteMe (a privacy service for removing personal information from online databases), OptOutPrescreen (a resource for opting out of prescreened credit and insurance offers), and Privacy Rights Clearinghouse (a nonprofit focusing on privacy education and resources).
- Advocate for stronger regulations. Support privacy regulations, such as the General Data Protection Regulation (GDPR) in the European Union (EU) and the California Consumer Privacy Act (CCPA), which offer individuals more control over their personal data.
- Limit data sharing. Be selective about what you share online and avoid linking multiple accounts to minimize data aggregation.
EU AI Act and U.S. Federal Proposals on AI Regulation
As AI technologies advance, regulatory frameworks are being developed to address the ethical, privacy, and safety concerns they raise. Both the EU and the United States have recently proposed comprehensive measures to govern AI, aimed at protecting privacy, increasing accountability, and preventing misuse.
The EU AI Act: A Leading Regulatory Framework
The EU AI Act, proposed in April 2021, is the first comprehensive attempt to regulate AI at a systemic level. It categorizes AI applications into four risk levels—unacceptable, high, limited, and minimal—and implements varying degrees of oversight based on the potential risks posed by these technologies.