Welcome back to the third installment of Artificial Intelligence 101. In this article, we will explore another limitation of AI: bias.
What Is AI Bias?
According to IBM, AI bias “refers to AI systems that produce biased results that either reflect or perpetuate human biases within a society, including historical and current social inequality.” When it goes unaddressed, bias prevents people from participating meaningfully in society and the economy, and it reduces AI’s potential.
There are several different types of AI bias, including selection bias, coverage bias, sampling bias, confirmation bias, and algorithmic bias.
Selection Bias
Selection bias occurs when a dataset’s examples are chosen in a way that does not reflect their real-world distribution. Selection bias can take different forms, including coverage bias and sampling bias.
Coverage Bias
Coverage bias occurs when data is not selected in a representative way. For example, suppose a model is trained to predict future sales of a new product based on phone surveys conducted with a sample of consumers who bought the product. Consumers who decided to buy a competing product were not surveyed; as a result, that group was not represented in the training data.
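To make this concrete, here is a minimal sketch in Python of how coverage bias skews an estimate. All of the numbers, field names, and purchase rates below are invented for illustration: the phone survey reaches only consumers who already bought the product, so the estimated purchase intent comes out far higher than the population’s true rate.

```python
import random

random.seed(0)

# Hypothetical population of 10,000 consumers (all numbers invented).
# Consumers who bought the competing product are far less likely to buy ours.
population = (
    [{"bought_ours": True, "would_buy": random.random() < 0.8} for _ in range(6000)]
    + [{"bought_ours": False, "would_buy": random.random() < 0.2} for _ in range(4000)]
)

def buy_rate(group):
    return sum(c["would_buy"] for c in group) / len(group)

# Coverage bias: the phone survey reaches only consumers who bought our product,
# so competitor buyers never appear in the training data.
surveyed = [c for c in population if c["bought_ours"]]

print(f"True purchase intent:   {buy_rate(population):.2f}")  # roughly 0.56
print(f"Biased survey estimate: {buy_rate(surveyed):.2f}")    # roughly 0.80
```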
Sampling Bias
Sampling bias occurs when proper randomization is not used during data collection. For example, suppose a model is trained to predict future sales of a new product based on email surveys of a sample of consumers who bought a competing product. Instead of using randomization techniques, the surveyor chooses the first 200 consumers who responded to the email, who may have been more enthusiastic about the product than average purchasers.
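The effect is easy to simulate. The sketch below, with invented numbers, assumes that more enthusiastic consumers reply to the survey email sooner; taking the first 200 replies then overstates the pool’s average enthusiasm, while a properly randomized sample does not.

```python
import random

random.seed(1)

# Hypothetical pool of 5,000 consumers who bought a competing product.
# Assumption (invented): the more enthusiastic a consumer is, the sooner they reply.
consumers = []
for _ in range(5000):
    enthusiasm = random.random()  # 0 = indifferent, 1 = very enthusiastic
    reply_hour = random.expovariate(0.5 + 2 * enthusiasm)  # keen buyers reply faster
    consumers.append({"enthusiasm": enthusiasm, "reply_hour": reply_hour})

def mean_enthusiasm(group):
    return sum(c["enthusiasm"] for c in group) / len(group)

# Sampling bias: survey the first 200 replies instead of drawing a random sample.
first_200 = sorted(consumers, key=lambda c: c["reply_hour"])[:200]
random_200 = random.sample(consumers, 200)

print(f"Population mean: {mean_enthusiasm(consumers):.2f}")   # about 0.50
print(f"First-200 mean:  {mean_enthusiasm(first_200):.2f}")   # noticeably higher
print(f"Random-200 mean: {mean_enthusiasm(random_200):.2f}")  # about 0.50
```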
Confirmation Bias
Confirmation bias occurs when a model’s builders unconsciously process data in ways that align with their pre-existing beliefs and hypotheses. An example is illustrative: a machine learning engineer (an engineer who designs and builds AI systems that learn from data to make predictions) is building a model that predicts aggressiveness in dogs based on features such as breed and environment. The engineer once had a negative encounter with a toy poodle and has since associated the breed with aggression. While curating the model’s training data, the engineer unconsciously discards features that provide evidence of docility in smaller dogs, skewing the results.
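A small, fully invented sketch shows how this kind of curation skews a dataset: dropping the records that contradict the engineer’s belief (docile small dogs) makes the remaining data appear to confirm it.

```python
# A minimal sketch of how confirmation bias can enter data curation.
# Every record here is invented for illustration.
records = [
    {"breed": "toy poodle", "size": "small", "aggressive": False},
    {"breed": "toy poodle", "size": "small", "aggressive": False},
    {"breed": "toy poodle", "size": "small", "aggressive": True},
    {"breed": "beagle",     "size": "small", "aggressive": False},
    {"breed": "mastiff",    "size": "large", "aggressive": True},
    {"breed": "mastiff",    "size": "large", "aggressive": False},
]

def aggression_rate(rows):
    return sum(r["aggressive"] for r in rows) / len(rows)

# Honest curation keeps every observation of small dogs.
small = [r for r in records if r["size"] == "small"]

# Biased curation quietly drops the counter-evidence (docile small dogs),
# so the remaining data "confirms" the engineer's belief.
curated = [r for r in records if not (r["size"] == "small" and not r["aggressive"])]
curated_small = [r for r in curated if r["size"] == "small"]

print(f"Small-dog aggression, full data:    {aggression_rate(small):.2f}")         # 0.25
print(f"Small-dog aggression, curated data: {aggression_rate(curated_small):.2f}") # 1.00
```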
Algorithmic Bias
Finally, algorithmic bias refers to systematic and repeatable errors in a computer system that create unfair or discriminatory outcomes. This type of bias can reinforce existing socioeconomic, racial, and gender identity biases.
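As a rough illustration, the sketch below (all data, groups, and thresholds invented) trains a toy “model” on historical loan decisions that penalized one group. Because the model memorizes each group’s historical approval threshold, the unfair outcome is reproduced systematically and repeatably on every batch of new applicants.

```python
import random

random.seed(2)

# Invented example: historical loan decisions penalized group B applicants,
# even though creditworthiness is distributed identically in both groups.
def make_applicants(n):
    return [
        {"group": random.choice(["A", "B"]), "score": random.gauss(0.5, 0.15)}
        for _ in range(n)
    ]

def historical_label(applicant):
    penalty = 0.15 if applicant["group"] == "B" else 0.0
    return applicant["score"] - penalty > 0.5

# A toy "model" trained on the biased labels: it memorizes each group's
# historical approval threshold, so the bias becomes systematic and repeatable.
train = make_applicants(5000)
thresholds = {}
for g in ("A", "B"):
    approved = [a["score"] for a in train if a["group"] == g and historical_label(a)]
    thresholds[g] = min(approved)

new_applicants = make_applicants(5000)
for g in ("A", "B"):
    rows = [a for a in new_applicants if a["group"] == g]
    rate = sum(a["score"] > thresholds[g] for a in rows) / len(rows)
    print(f"Group {g} approval rate on new applicants: {rate:.2f}")  # B comes out lower
```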