As artificial intelligence moves out of data labs and into the real world, we find ourselves in an era of AI-driven decision-making. Whether it’s an HR system helping us sort through hundreds of job applications or an algorithm that assesses the likelihood of a criminal re-offending, these applications help shape our future.
AI-based systems are more accessible than ever before. And as AI adoption grows across industries, questions arise about fairness and how to ensure it in these systems. Racial, age, and gender biases are all forms of AI bias. Understanding how to detect and avoid bias in AI models is a crucial research topic, and it becomes more important as AI expands into new sectors.
“AI Systems are only as good as the data we put into them.”
Champion AI Fairness
AI builds upon the data it is fed. While AI can often improve human decision-making, it can also inadvertently accentuate and reinforce human biases. What is AI bias? AI bias occurs when a model reflects implicit human prejudice around race, gender, ideology, and other characteristics.
Google’s ‘Machine Learning and Human Bias’ video provides a tangible example of this idea. Picture a shoe. Your idea of a shoe may be very different from another person’s idea of a shoe (you might imagine a sports shoe whereas someone else might imagine a dressy shoe).
Now, if you teach a computer to recognize a shoe, you might teach it your own idea of a shoe, exposing it to your own bias. This is comparable to the danger of a single story.
“The single story creates stereotypes, and the problem with stereotypes is not that they are untrue, but that they are incomplete. They make one story become the only story.” - Chimamanda Ngozi Adichie
So, what happens when we provide AI applications with data that is embedded with human biases? If our data is biased, our model will replicate those unfair judgements.
AI Bias Examples
Here are three examples of AI replicating human bias and prejudice:
- Hiring automation tools: AI is often used to support HR teams by analyzing job applications, and some tools rate candidates by observing patterns in past successful applications. Bias has appeared when these tools recommended male candidates over female ones, having learned from historical data in which women were underrepresented.
- Risk assessment algorithms: courts across America use algorithms to assess the likelihood of a criminal re-offending. Researchers have pointed out the inaccuracy of some of these systems, finding racial biases in which black defendants were often predicted to be at higher risk of re-offending than others.
- Online social chatbots: several social media chatbots built to learn language patterns have been removed and discontinued after posting inappropriate comments. These chatbots, built using Natural Language Processing (NLP) and Machine Learning, learned from interactions with trolls and could not filter out indecent language.
The three scenarios above illustrate AI’s potential to be biased against groups of people. The key underlying factor in these results is biased data. Although inadvertently, the systems did exactly what they were trained to do: they made sense of the data they were given.
Data reflects social and historical processes and can easily operate to the disadvantage of certain groups. When trained on such data, AI can reproduce, reinforce, and even exacerbate existing biases. As we move into an era of AI-driven decision-making, it is crucial to understand the biases that exist and take preventive measures to avoid discriminatory patterns.
Eliminate AI Bias Through Early Detection
Understanding types of biases and how to detect them is essential to help ensure fairness. Google identifies three categories of biases:
- Interaction bias: when systems learn biases from the users driving the interaction. For example, chatbots that are taught to recognize language patterns through continued interactions.
- Latent bias: when data contains implicit biases against race, sexuality, gender, etc. For example, risk assessment algorithms that show evidence of racial discrimination.
- Selection bias: when the data used to train the algorithm over-represents one population. For example, when men are over-represented in past job applications and the hiring automation tool learns from this.
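One simple preventive check for selection bias is to measure how each group is represented in the training data before a model ever sees it. The sketch below is a minimal illustration in plain Python; the application records and the 30% threshold are invented assumptions, not figures from this article:

```python
from collections import Counter

def representation_rates(records, attribute):
    """Return each group's share of the training records for one attribute."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Hypothetical past-applications data: 80 male records, 20 female records.
applications = [{"gender": "male"}] * 80 + [{"gender": "female"}] * 20

rates = representation_rates(applications, "gender")
# Flag groups below an (illustrative) 30% representation threshold
# before training, rather than discovering the skew in the model's output.
flagged = {g for g, r in rates.items() if r < 0.30}
```

A check like this does not fix the imbalance, but it surfaces the skew early enough to rebalance or re-collect the data.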
So how can we become more aware of these biases in data? In Machine Learning literature, ‘fairness’ is defined as “a practitioner guaranteeing fairness of a learned classifier, in the sense that it makes positive predictions on different subgroups at certain rates.” Fairness can be defined in many ways, depending on the given problem, and identifying the criteria behind fairness requires social, political, cultural, historical, and many other tradeoffs.
Consider the fairness of assigning a group to certain classifications. For example, is it fair to rate different groups’ loan eligibility equally even if they show different rates of payback? Or is it fair to grant loans in proportion to payback rates? Even in a scenario like this, people may disagree about what is fair or unfair. Defining fairness is a challenge, and even with a rigorous process in place, it is impossible to guarantee. For that reason, it is imperative to measure bias and, consequently, fairness.
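The quoted definition, a classifier making positive predictions on different subgroups at comparable rates, is commonly known as demographic parity, and it is straightforward to measure. A minimal sketch follows; the toy loan-approval predictions and group labels are invented for illustration:

```python
def positive_rates(predictions, groups):
    """Positive-prediction rate per subgroup (a demographic-parity check)."""
    totals, positives = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred else 0)
    return {g: positives[g] / totals[g] for g in totals}

def parity_gap(rates):
    """Largest difference in positive rates between any two subgroups."""
    values = list(rates.values())
    return max(values) - min(values)

# Toy loan-approval predictions (hypothetical): 1 = approve, 0 = deny.
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = positive_rates(preds, group)   # per-group approval rates
gap = parity_gap(rates)                # 0 would mean exact parity
```

A gap near zero indicates the classifier treats the subgroups similarly under this particular definition; whether that is the right definition for a given loan scenario is exactly the tradeoff discussed above.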
Strategies for measuring bias exist across all sectors of society; in cinema, for example, the Bechdel test assesses whether a movie contains gender bias. Similarly, in AI, means of measuring bias have started to arise. Aequitas, AI Fairness 360, Fairness Comparison, and Fairness Measures, to name a few, are resources data scientists can leverage to analyze fairness. Aequitas, for example, facilitates auditing for fairness, helping data scientists and policymakers make informed and more equitable decisions. Data scientists can use these resources to evaluate fairness and make their predictions more transparent.
The Equity Evaluation Corpus (EEC) is a good example of a resource that lets data scientists automatically assess fairness in an AI system. This dataset of over 8,000 English sentences was specifically crafted to tease out biases towards certain races and genders. It was used to automatically assess 219 NLP systems that predict sentiment and emotion intensity, and, interestingly, more than 75% of the systems analyzed assigned higher intensity scores to sentences associated with a specific gender or race.
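The EEC approach can be imitated on a small scale: fill a template with person words that differ only by group, score each sentence with the system under test, and compare the group averages. In the sketch below, `score_sentiment` is a deliberately biased stand-in for a real NLP system, and the template and scores are invented for illustration, not drawn from the EEC:

```python
# EEC-style probe: sentences differ only in the person word.
TEMPLATE = "{person} feels angry."
GROUPS = {"female": ["She"], "male": ["He"]}

def score_sentiment(sentence):
    """Stand-in scorer with a deliberate, illustrative gender bias."""
    return 0.7 if sentence.startswith("She") else 0.5

def group_means(template, groups, scorer):
    """Average score per group over the filled-in template sentences."""
    return {
        group: sum(scorer(template.format(person=p)) for p in persons) / len(persons)
        for group, persons in groups.items()
    }

means = group_means(TEMPLATE, GROUPS, score_sentiment)
bias_gap = means["female"] - means["male"]  # a nonzero gap signals bias
```

Because the sentences are identical apart from the person word, any systematic score difference can only come from the system's treatment of that word, which is what makes the EEC methodology so clean.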
Sources for Unbiased AI Training Data
As more AI systems are deployed, and at a faster rate, the implications of underlying bias grow. To address this concern, DefinedData’s catalog now offers detailed information on the gender, age, accent, and phonetic distribution of datasets, as well as metadata on the recordings and audio samples. Considering all types of individuals impacted by AI systems helps build a future where AI integrates into society seamlessly and is widely accepted. It also helps to remember that consumers hold companies accountable; companies with responsible AI practices benefit, as do their customers.
Learn more about AI fairness and ways to avoid harmful bias outcomes with these resources: