November 01, 2017

Building Ethical Algorithms

By Natasha Duarte

This summer, the Supreme Court declined to hear a case about the constitutional rights of a man whose sentencing decision was determined in part by a computer.1 In State v. Loomis,2 the state used a tool called COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) to calculate the “recidivism risk” of the defendant—the likelihood that he would be rearrested for another crime within two years, based on people who shared his characteristics and background. COMPAS asks a series of questions about the defendant and runs the answers through a proprietary algorithm, which provides a recidivism “risk score.”3 A study by ProPublica found that when COMPAS predictions were wrong, they were more likely to incorrectly classify black defendants as high risk and white defendants as low risk.4

Whether they are used to target ads or to determine prison sentences, the algorithms behind automated decisions represent more than math. They also represent human choices, such as what data to use to make decisions, what variables to consider, and how much weight to assign to an algorithmic prediction. Each of these choices is value-laden and may lead to different outcomes. For example, crime prediction algorithms are often trained to “learn” patterns from historical arrest records (known as the “training data”). This prioritizes the historical patterns behind law enforcement’s decisions about where to patrol and whom to arrest. These tools can also be calibrated to minimize false positives or false negatives, depending on whether a jurisdiction would rather err on the side of keeping “low-risk” people in jail or letting “high-risk” people go free. For better or worse, choices about how to design and use decision-making algorithms are shaping policy, culture, and societal norms.
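The calibration choice described above can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the scores, the outcomes, the cutoffs, and the function name `error_rates` are invented and do not reflect how COMPAS or any real tool works. The sketch only shows that moving a single cutoff trades one kind of error for the other.

```python
def error_rates(scores, reoffended, threshold):
    """Return (false_positive_rate, false_negative_rate) for a cutoff.

    A person is labeled "high risk" when their score is >= threshold.
    """
    # False positive: labeled high risk but did not reoffend.
    fp = sum(1 for s, y in zip(scores, reoffended) if s >= threshold and not y)
    # False negative: labeled low risk but did reoffend.
    fn = sum(1 for s, y in zip(scores, reoffended) if s < threshold and y)
    negatives = sum(1 for y in reoffended if not y)
    positives = sum(1 for y in reoffended if y)
    return fp / negatives, fn / positives


# Hypothetical data: a risk score per person and whether that person was rearrested.
scores = [2, 3, 4, 5, 6, 7, 8, 9]
reoffended = [False, False, True, False, True, False, True, True]

for threshold in (4, 6, 8):
    fpr, fnr = error_rates(scores, reoffended, threshold)
    print(f"threshold={threshold}: FPR={fpr:.2f}, FNR={fnr:.2f}")
```

On this invented data, raising the cutoff from 4 to 8 lowers the false positive rate from 0.50 to 0.00 while raising the false negative rate from 0.00 to 0.50. Which point on that curve is acceptable is a policy decision, not a mathematical one.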

Industry and government have a responsibility to avoid building and using harmful automated decision-making systems. Several major tech industry players have made public commitments to lead the way on creating ethical standards for machine learning and artificial intelligence.5 Smaller companies, too, are beginning to hire chief ethics officers or assemble entire teams to focus on ethical issues regarding the use of big data, machine learning, and automated decision-making systems. Slowly but surely, institutions that build and use automated decision-making systems are recognizing ethical review as a necessary prerequisite to the large-scale deployment of these systems.

Ethical review of automated decision-making systems is a complex and nuanced process. Yet companies and policymakers do not need to start from scratch when it comes to developing an ethical framework to guide this review. Several frameworks exist that have been or can be adapted to the automated computing context. Ethical principles are only as good as the processes in place to implement and enforce them. Companies need to adopt—and constantly reevaluate—internal processes for ethical design and review. The first section of this article discusses existing ethical frameworks that can be adapted to automated decision-making systems, and the second section is devoted to implementation strategies.

Ethical Frameworks

Several established frameworks provide ethical principles to guide organizations’ best practices around technology design and data use. Although these frameworks were not specifically designed for machine learning or artificial intelligence, they can be adapted to different technologies.

Belmont Report

The Belmont Report6 was commissioned in the 1970s, prompted by high-profile medical research scandals including the Tuskegee syphilis study7 and the Stanford prison experiment.8 The Belmont Report identified three key principles that continue to govern human subjects research: (1) respect for persons, (2) beneficence, and (3) justice. The first principle requires researchers to respect the basic dignity and autonomy of their subjects. Research subjects must be presented with relevant information in a comprehensible format and then voluntarily give their consent to participate. “Beneficence” embodies the well-known maxim of “do no harm.” It requires researchers to conduct risk-benefit assessments, maximize the possible benefits to research subjects and to the public, and minimize possible harms. Researchers must also assess the specific risks and benefits of including members of vulnerable populations (such as children or pregnant women) in a study. The “justice” principle demands that the benefits and burdens of research are fairly distributed. The report notes that fair distribution does not always mean equal distribution. “[D]istinctions based on experience, age, deprivation, competence, merit and position do sometimes constitute criteria justifying differential treatment for certain purposes.”9

While the Belmont Report was developed to address medical research on human subjects, its principles are just as salient for big data analysis and automation. For example, consider the concept of informed consent. Imagine a user creates a social media account and agrees to a privacy policy stating that the information she discloses may be used for “research.” Has the user consented to allow the company, or outside researchers, to use that information to predict the user’s mental health status? Is separate consent required for this type of use, which arguably was not anticipated by the user when she created a profile? Researchers at traditional institutions must address consent issues when they seek institutional review board (IRB) approval to collect information from research subjects. However, big data analysis is often done without IRB approval for several reasons, including the ease of access to publicly available data sets (or data held by companies) and a lack of institutional clarity about whether big data research counts as human subjects research requiring IRB approval.10

Menlo Report

In 2012, the Menlo Report11 was published in response to new questions about the ethics of information and communications technology research (ICTR). The Menlo Report identified three factors that make risk assessment challenging in ICTR: “the researcher-subject relationships, which tend to be disconnected, dispersed, and intermediated by technology; the proliferation of data sources and analytics, which can heighten risk incalculably; and the inherent overlap between research and operations.”12 Each of these factors also applies to data-driven automated decision-making systems. Because the data that feeds automated systems is collected and aggregated digitally, data subjects often do not know they are data subjects, and the effects of automated systems can be widely dispersed, difficult to detect, and difficult to connect to one particular system.

The Menlo Report builds on the principles articulated in the Belmont Report but accounts for the additional challenges presented by information technology. For example, “In the ICTR context, the principle of Respect for Persons includes consideration of the computer systems and data that directly . . . impact persons who are typically not research subjects themselves.”13 The report added a fourth principle calling for consideration of law and public interest. This principle asks researchers to engage in legal due diligence, transparency, and accountability.

ACM Software Engineering Code of Ethics and Professional Practice

While the Belmont and Menlo Reports apply specifically to research, the Association for Computing Machinery (ACM) published a code of ethics in 1999 that applies generally to the practice and profession of software engineering. The ACM Software Engineering Code of Ethics and Professional Practice (ACM Code)14 recognized engineering as a profession and acknowledged that “software engineers have significant opportunities to do good or cause harm” and to enable or influence others to do good or cause harm. Among other things, the ACM Code requires engineers to act in the public interest, even when serving their clients or employers. Under the ACM Code, engineers’ responsibility to the public requires them to ensure that any software they produce is safe, passes appropriate tests, and is ultimately in the public good; to disclose any potential danger to the user, the public, or the environment; to be fair and avoid deception concerning the software; and to consider issues of physical disabilities, allocation of resources, economic disadvantage, and other factors that can diminish access to the benefits of the software. Engineers must also report “significant issues of social concern” to their employers or clients. In turn, the ACM Code prohibits employers from punishing engineers for expressing ethical concerns.

Fair Information Practice Principles

The Fair Information Practice Principles (FIPPs)15 are internationally recognized as the foundational principles for responsible collection, use, and management of data, and they continue to serve as guiding principles in the era of big data.16 There are many iterations of the FIPPs, but they were first codified in 1980 by the Organisation for Economic Co-operation and Development (OECD) in its Guidelines on the Protection of Privacy and Transborder Flows of Personal Data.17 Compliance with the FIPPs means minimizing the amount of data collected and the length of retention; ensuring that data is collected for a specified purpose and used only in contexts consistent with that purpose (unless additional consent is requested and given); giving individuals control over and access to their personal information, including the right to consent to its collection and use and to correct or delete it; and securing data through the use of encryption, de-identification, and other methods.

Common Principles

Together, these existing frameworks can provide a roadmap to guide ethical review of big data analytics, automated decision-making systems, and artificial intelligence. Below are some examples of how these common principles can apply to automated decision-making systems. This is not a comprehensive list of considerations to guide ethical review, and many of these questions do not have clear right or wrong answers, but they provide insight into key concepts and approaches.

Individual Autonomy and Control

Does the automated system reflect individuals’ privacy choices? For example, does it target ads by inferring sensitive characteristics—such as race, gender, or sexual orientation—that an individual has intentionally obscured or declined to disclose?

Beneficence and Risk Assessment

Are useful insights that companies glean from data passed on to data subjects in helpful ways? For example, information that is not obviously health related may be used to predict individuals’ propensity for certain health conditions.18 Whether these predictions should be communicated to individuals, and how best to do so, is a complex question requiring rigorous evaluation of the risks and benefits of disclosure.

Justice or Fairness

Are any groups or populations—especially protected or vulnerable classes—over- or underrepresented in the training data? Does the automated system rely on existing data or make assumptions about certain groups in a way that perpetuates social biases? Does the system create disparate outcomes for different groups or populations? Is the risk of error evenly distributed across groups?
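The last question, whether the risk of error is evenly distributed, can be checked directly once predictions and outcomes are recorded by group. The following Python sketch is one hedged way to do so. The group labels, the records, and the function name `false_positive_rate_by_group` are hypothetical, and the false positive rate is only one of several error measures a reviewer might compare.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Per group, the share of people who did NOT reoffend but were
    still labeled high risk.

    Each record is a (group, predicted_high_risk, reoffended) tuple.
    """
    false_positives = defaultdict(int)
    non_reoffenders = defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            non_reoffenders[group] += 1
            if predicted_high_risk:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in non_reoffenders.items()}


# Hypothetical records for two invented groups, "A" and "B".
records = [
    ("A", True, False), ("A", True, False), ("A", False, False),
    ("A", False, False), ("A", True, True),
    ("B", True, False), ("B", False, False), ("B", False, False),
    ("B", False, False), ("B", True, True),
]

rates = false_positive_rate_by_group(records)
print(rates)
```

In this invented data set, members of group A who did not reoffend were wrongly labeled high risk at twice the rate of group B (0.5 versus 0.25), even though each group has the same number of people and the same number of reoffenders.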

Transparency and Accountability

Are the claims a company makes about its automated decision-making systems truthful and easy for users to understand? Do explicit and implicit statements allow users (or parties who contract to use the system) to form accurate expectations about how the system functions, its accuracy, and its usefulness?

Data Governance and Privacy

Is the data used to train a system accurate, complete, and up to date? Was the collection of data limited to only information needed to perform a specific function or solve a specific problem? Was personal information effectively deleted when it was no longer necessary or relevant? Were adequate steps taken to ensure that training data were not linked or reasonably linkable to an individual?

Professional Judgment

Are engineers trained and encouraged to spot and raise ethical issues? Are there specific mechanisms through which engineers and other employees can report ethical issues? Are there incentives (or disincentives) for doing so?

Implementing Ethical Review of Automated Decision-Making Systems

Creating a set of ethical principles or guidelines is a good start, but a review process must accompany it. The technology sector is beginning to recognize that conducting ex ante ethical reviews of automated decision-making systems is imperative, though industry has been slow to develop and share processes for doing so. This may be because these systems are still seen as new and experimental. The uncertainty surrounding the technology is all the more reason to shore it up with sound ethical risk assessment procedures. Even with ethical review, automated decision-making systems (like any technology) will have unintended consequences, some of them harmful. Sound internal processes can put companies in a better position to detect, remedy, and learn from harmful outcomes and avoid replicating them.

The technology at issue may be new, but the need for businesses to adopt internal processes to promote social good is not. Over the past few decades, industry and civil society have engaged in the development of process-based frameworks for putting human rights and corporate social responsibility principles into practice. These frameworks are useful for informing how companies can implement ethical principles into the design of automated decision-making systems before they are deployed in the wild.

UN Guiding Principles on Business and Human Rights

In 2011, the United Nations (UN) Human Rights Council published its Guiding Principles on Business and Human Rights,19 based on the “Protect, Respect, and Remedy” framework developed by UN Special Representative John Ruggie. The guidance includes operational principles for businesses, including (1) making a publicly available policy commitment to respect human rights; (2) assessing actual and potential human rights impacts that the business may cause or contribute to; (3) assigning responsibility for addressing human rights impacts to the appropriate people within the business; (4) tracking the effectiveness of the business’s response to human rights issues through qualitative and quantitative indicators, drawing on feedback from both internal and external sources, including affected stakeholders; (5) reporting publicly on how the business addresses human rights impacts, “particularly when concerns are raised by or on behalf of affected stakeholders”; and (6) providing remedies for adverse impacts.

Ranking Digital Rights Corporate Accountability Index

Since 2015, Ranking Digital Rights has evaluated companies’ respect for freedom of expression and privacy using its Corporate Accountability Index.20 The index includes measures of internal implementation mechanisms such as (1) employee training, (2) whistleblower programs, (3) impact assessments, (4) stakeholder engagement, (5) grievance and remedy mechanisms, and (6) public disclosure of implementation processes.

GNI Implementation Guidelines

The Global Network Initiative (GNI) Implementation Guidelines for the Principles on Freedom of Expression and Privacy provide details for how technology companies can protect and advance free expression and privacy rights.21 The guidance includes (1) oversight by the board of directors of the company’s human rights risk assessments, reporting, and response; (2) employee training; (3) impact assessments, including specific guidance building on the UN framework; (4) ensuring that business partners, suppliers, and distributors also comply with human rights principles; (5) management structures for integrating human rights compliance into business operations; (6) written procedures and documentation; (7) grievance and remedy mechanisms; (8) whistleblowing mechanisms; (9) multi-stakeholder collaboration; and (10) transparency.

Applying Human Rights Implementation Guidance to Automated Decision-Making Systems

The frameworks for implementing human rights principles share a set of common processes that can also support ethical design and use of automated decision-making systems. These common processes are (1) public commitments; (2) employee training; (3) risk assessment; (4) testing; (5) grievance and remedies mechanisms; (6) transparency measures, such as public reporting of ethical review processes or results; and (7) oversight. Here are examples of how these processes could be adapted to the automated decision-making context:

Public Commitments

Companies should make public commitments to uphold ethical principles in the design, training, review, testing, and use of the automated systems they build. These commitments should include a statement of the company’s ethical principles.

Employee Training

Companies should develop rigorous ethical training for employees and consultants who build automated systems—particularly engineers and product developers—based on real-world scenarios. Training should be specialized according to the type of company, its mission, and the products it creates, and should train engineers to become adept at spotting ethical issues on their own.

Risk Assessment

Companies should develop comprehensive risk assessments designed to anticipate potential negative or disparate impacts of automated decision-making systems on all individuals and groups likely to be impacted. Potential risks include but are not limited to privacy and data security, discrimination, loss of important opportunities (particularly in the contexts of finance, housing, employment, and criminal justice), and loss of autonomy or choice.

It is important that any risk-benefit assessments be realistic about the potential benefits of an automated decision-making system. For example, claims that collecting more data will help solve complex problems should have a fact-based rationale and should not automatically override concerns about data minimization and privacy.

Companies should conduct individualized risk assessments for each vulnerable population that could be affected by the outcomes of an automated system. The assessments should take into account historical marginalization, disparities in access to resources and justice, and other factors that might lead to harmful disparate impacts.

Risk assessments should also anticipate potential uses and abuses of an automated system by third parties. For example, a company selling an automated decision-making system to a government entity must assess the risk that the government entity will use the system in ways that the company may not have intended or in ways that create disparate impacts or violate rights.

Testing

Automated decision-making systems should be tested for limitations, such as disparate impacts on minority groups. The possibilities and best practices for testing (e.g., whether comprehensive testing before deploying an algorithm in the wild is practicable) may vary across tools and contexts. In some cases, access to the algorithm and testing by outside experts may be the best way to ensure fairness.
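One simple form such a test could take is a comparison of how often each group receives a favorable decision. The Python sketch below is an illustration under stated assumptions: the group names, the decisions, and the helper names `favorable_rate` and `outcome_rate_ratio` are invented, and the sketch sets no pass/fail threshold, because what counts as an acceptable gap depends on the tool and the context.

```python
def favorable_rate(decisions):
    """Share of decisions that were favorable (True)."""
    return sum(decisions) / len(decisions)


def outcome_rate_ratio(decisions_by_group):
    """Return (ratio, rates), where ratio is the lowest group's
    favorable-outcome rate divided by the highest group's.

    A ratio of 1.0 means all groups receive favorable outcomes at the
    same rate; values closer to 0 indicate a larger gap. Assumes every
    group has at least one decision and at least one group has a
    favorable outcome.
    """
    rates = {g: favorable_rate(d) for g, d in decisions_by_group.items()}
    return min(rates.values()) / max(rates.values()), rates


# Hypothetical decisions: True means the system produced a favorable outcome.
decisions_by_group = {
    "group_x": [True] * 8 + [False] * 2,
    "group_y": [True] * 4 + [False] * 6,
}

ratio, rates = outcome_rate_ratio(decisions_by_group)
print(rates, ratio)
```

Here group_y receives favorable outcomes at half the rate of group_x. A result like this does not by itself establish unfairness, but it flags where outside experts or internal reviewers should look more closely.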

Grievance and Remedies

Companies should have clear and transparent mechanisms in place to receive and respond to grievances from individuals who believe they have been harmed by an automated decision-making system. For example, if a social media user believes her content was wrongfully flagged and removed by an automated tool for violating the platform’s terms of service, she should be able to report the issue to the company and receive an explanation of why the content violated the terms of service and an opportunity to appeal the decision.

Transparency

Companies should be open about their ethical review processes and mechanisms when practicable. Sharing these processes can help create industry guideposts, facilitate discovery and mitigation of adverse impacts, and foster trust between companies and the public.

Oversight

Several companies have recently taken the important step of hiring a chief ethics officer and—even better—assembling a team dedicated to ethical assessment and review. These teams should have full access to information about projects and the authority to recommend changes to or even halt projects that do not meet ethical standards.

Case Study: Airbnb’s Inclusion Team

During the past year, the home-sharing company Airbnb has made a number of internal changes aimed at reducing discrimination on its platform. After struggling with accounts of racial discrimination against would-be guests (those seeking accommodations), the company reviewed its practices and policies to see what structural changes might help reduce racial bias on the platform.22 One of those changes was the creation of a permanent team of engineers, data scientists, researchers, and designers whose sole function is to advance inclusion and root out bias. A key function of this team is assessing what information in a would-be renter’s profile—such as photo and name—might trigger a racially biased decision to reject the renter, and whether highlighting other information could mitigate that bias. The team is experimenting with reducing the prominence of profile photos and highlighting objective information like reservation details, reviews of would-be guests from previous hosts, and verifications. For guests who do not have reviews or verifications, the team is exploring how it can use design to improve messaging and social connections to build trust between hosts and would-be guests. The company has also overhauled its processes for receiving and responding to discrimination complaints, making it easier for users to flag and get help for potential instances of discrimination.

In this case, Airbnb is tackling individual human bias rather than algorithmic bias, but the lessons it has learned apply to automated decision-making systems as well: mitigating bias requires an internal acknowledgment of concerns, dedicated personnel, constant evaluation, and thoughtful internal mechanisms. Airbnb has acknowledged that there is no single solution to addressing bias and discrimination on its platform, and it is still experimenting with different approaches.

Conclusion

Automated decision-making systems are more than neutral lines of code. They are agents of policy, carrying out the values embedded in their design or data. They get their values from the choices made by the humans who create them—whether those value choices are conscious or not. A well-developed ethical review process is not a silver bullet for preventing unfair outcomes. But it is necessary if we hope to build systems that promote democracy, equality, and justice. 

Endnotes

1. Loomis v. Wisconsin, 137 S. Ct. 2290 (2017).

2. 881 N.W.2d 749 (Wis. 2016).

3. See Petition for Writ of Certiorari, Loomis, 137 S. Ct. 2290 (No. 16-6387), http://www.scotusblog.com/wp-content/uploads/2017/02/16-6387-cert-petition.pdf; Julia Angwin et al., Machine Bias, ProPublica (May 23, 2016), https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.

4. Angwin et al., Machine Bias, supra note 3; Julia Angwin & Jeff Larson, Bias in Criminal Risk Scores Is Mathematically Inevitable, Researchers Say, ProPublica (Dec. 30, 2016), https://www.propublica.org/article/bias-in-criminal-risk-scores-is-mathematically-inevitable-researchers-say.

5. John Markoff, How Tech Giants Are Devising Real Ethics for Artificial Intelligence, N.Y. Times, Sept. 1, 2016, https://www.nytimes.com/2016/09/02/technology/artificial-intelligence-ethics.html.

6. Nat’l Comm’n for the Protection of Human Subjects of Biomedical & Behavioral Research, The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research (1979) [hereinafter Belmont Report], https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html.

7. The Tuskegee Timeline, Ctrs. for Disease Control & Prevention, https://www.cdc.gov/tuskegee/timeline.htm (last updated Aug. 30, 2017).

8. Stanford Prison Experiment, http://www.prisonexp.org/ (last visited Oct. 17, 2017).

9. Belmont Report, supra note 6.

10. See, e.g., Kalev Leetaru, Are Research Ethics Obsolete in the Era of Big Data?, Forbes (June 17, 2016), https://www.forbes.com/sites/kalevleetaru/2016/06/17/are-research-ethics-obsolete-in-the-era-of-big-data/.

11. David Dittrich & Erin Kenneally, U.S. Dep’t of Homeland Sec., The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research (2012), https://www.caida.org/publications/papers/2012/menlo_report_actual_formatted/menlo_report_actual_formatted.pdf.

12. Id. at 5.

13. Id. at 7.

14. Software Engineering Code of Ethics and Professional Practice, Ass’n for Computing Machinery (1999), http://www.acm.org/about/se-code.

15. See, e.g., Org. of Econ. Co-operation & Dev., The OECD Privacy Framework 70–72 (2013) [hereinafter OECD Privacy Framework], http://www.oecd.org/sti/ieconomy/oecd_privacy_framework.pdf.

16. See Robert Gellman, Fair Information Practices: A Basic History (Apr. 10, 2017) (unpublished manuscript), https://bobgellman.com/rg-docs/rg-FIPshistory.pdf.

17. OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data, Org. of Econ. Co-operation & Dev. (1980), http://www.oecd.org/sti/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsofpersonaldata.htm. These guidelines were updated in 2013. OECD Privacy Framework, supra note 15.

18. Deepika Singhania, This Startup Uses AI to Predict Lifestyle Disease Risks, YourStory (Apr. 20, 2017), https://yourstory.com/2017/04/tissot-signature-innovators-club-march-winner/.

19. U.N. Human Rights Council, Guiding Principles on Business and Human Rights, U.N. Doc. HR/PUB/11/04 (2011), http://www.ohchr.org/Documents/Publications/GuidingPrinciplesBusinessHR_EN.pdf.

20. 2017 Indicators, Ranking Digital Rts., https://rankingdigitalrights.org/2017-indicators/ (last updated Sept. 14, 2016).

21. Implementation Guidelines for the Principles on Freedom of Expression and Privacy, Global Network Initiative, http://globalnetworkinitiative.org/sites/default/files/Implementation-Guidelines-for-the-GNI-Principles_0.pdf (last visited Oct. 17, 2017).

22. Airbnb hired Laura Murphy to lead the review, and Murphy assembled a cross-department internal team as well as outside experts. Together, they reviewed aspects of Airbnb such as how hosts and guests interact, the company’s written policies, enforcement of the policies and response to complaints of discrimination, and the (lack of) diversity on the Airbnb team. Laura W. Murphy, Airbnb’s Work to Fight Discrimination and Build Inclusion (2016), http://blog.atairbnb.com/wp-content/uploads/2016/09/REPORT_Airbnbs-Work-to-Fight-Discrimination-and-Build-Inclusion.pdf.

Natasha Duarte (Natasha@cdt.org) is a policy analyst at the Center for Democracy & Technology.