An Evolving Discussion
AI and Bias
By Dr. Michiel Baas
As Artificial Intelligence (AI) relies on data, there is no avoiding the issue of bias in large datasets. Although we like to think of AI as objective — a clean, mathematical system that sorts out the messiness of human decision-making — its output is often subjective. Every algorithm carries traces of the people and histories it draws from. Datasets carry in them the history of how they were sourced and the choices that went into their curation and labelling; they therefore also reflect the human hands involved in that process. As AI is no longer science fiction, the biases it reproduces affect all of us. This means we must remain extra vigilant about the misrepresentation, discrimination and racism it may facilitate.
What Bias Really Means
When we say AI is “biased,” we don’t mean it has opinions. The bias seeps in through data — the stuff we feed it. Machines learn from the past, and the past, as we know, is not neutral. It is layered with ideas, preferences and opinions that are far from universally shared. If an algorithm is trained on historical hiring data, for example, it may learn that men are preferred over women, and simply repeat that pattern; a small sketch below makes this concrete.

Researchers talk about many types of bias. Data bias happens when the information used to train a model leaves people out. This is common in India, where much of the data comes from urban or English-speaking users. Selection bias creeps in when a dataset doesn’t reflect the real world. Think of a health AI trained mostly on patients from Delhi or Mumbai — it might fail when used in rural Bihar or Karnataka. Issues of pollution are much greater in megacities, while the physical demands of labour and access to care pose different challenges in rural areas. Then there’s confirmation bias, which arises when developers, often unwittingly, choose data that supports what they already believe. Often, though, the bias lies not in the data but in the design goal itself. Optimizing for “efficiency” might mean cutting corners that hurt those already at a disadvantage.
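That sketch, written in Python, is purely illustrative: the features, the numbers and the choice of a simple logistic regression are invented stand-ins for whatever a real screening system might use. The mechanism, however, is the same: a model fitted to skewed historical decisions learns to reproduce them.

```python
# Illustrative only: a toy "resume screener" trained on biased historical decisions.
# All data is invented. Gender is coded 0 = male, 1 = female; the other feature
# is years of experience.
from sklearn.linear_model import LogisticRegression

X_history = [
    [5, 0], [6, 0], [4, 0], [7, 0],   # male applicants
    [5, 1], [6, 1], [4, 1], [7, 1],   # female applicants with the same experience
]
# Historical outcomes: 1 = shortlisted, 0 = rejected. The men were all
# shortlisted; most of the equally qualified women were not.
y_history = [1, 1, 1, 1,
             0, 0, 1, 0]

screener = LogisticRegression().fit(X_history, y_history)

# Two new applicants, identical experience, differing only in gender:
for applicant in ([6, 0], [6, 1]):
    prob = screener.predict_proba([applicant])[0, 1]
    print(applicant, round(float(prob), 2))
# The predicted shortlisting probability drops for the female applicant even
# though the qualifications are identical: the model has learned the old pattern.
```

The point is not this particular model; any system that optimizes for agreement with past decisions will, by design, carry the biases of those decisions forward.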
How It Shows Up
You can see AI bias almost everywhere once you start looking. In recruitment, automated résumé screeners have been found to favour male candidates or graduates from certain elite institutes. Ageism has seeped in as well: age is used to predict not only physical fitness but also the likelihood of pregnancy among female candidates. As a result, older men and younger women increasingly face reduced chances of being selected for an interview. Crucial here is what the past hiring data itself reflects, but so is how aware those building and using these systems are of the potential for bias.

In finance, algorithms used for digital lending may penalize people from lower-income areas or those with thin credit histories. From the position of a bank this may make sense, but it also exacerbates a situation already characterized by fewer opportunities. Moreover, it may not tell the entire story. AI works from datasets but cannot ask for additional information or give a candidate for a job or a loan the chance to speak. Those seeking to intervene therefore need to take a critical look at how they use a particular dataset and how much trust they place in AI; a simple version of such a check is sketched below. As Madhumita Murgia’s Code-Dependent: Living in the Shadow of AI (2024) reminds us, automated systems are reshaping lives all over the world. How much trust are we willing to put in them? Who is in control, and what remains of our autonomy and agency in all this?
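One form such a check could take is an outcome audit: before a system is trusted, compare its decisions across groups. The sketch below is an illustration under invented assumptions (the group labels, the decisions, and the 0.8 threshold borrowed from the common “four-fifths” rule of thumb are all stand-ins), not a description of any real lender’s or recruiter’s pipeline.

```python
# Illustrative audit sketch: compare approval rates across groups for an
# automated decision system. All records below are invented.
from collections import defaultdict

# Each record: (group, decision), where decision 1 = approved / shortlisted.
decisions = [
    ("urban", 1), ("urban", 1), ("urban", 0), ("urban", 1),
    ("rural", 0), ("rural", 0), ("rural", 1), ("rural", 0),
]

approved = defaultdict(int)
total = defaultdict(int)
for group, decision in decisions:
    total[group] += 1
    approved[group] += decision

rates = {group: approved[group] / total[group] for group in total}
print("approval rates:", rates)

# Disparate-impact ratio: lowest approval rate divided by the highest.
# A crude but common rule of thumb flags ratios below 0.8 for closer review.
ratio = min(rates.values()) / max(rates.values())
print("disparate-impact ratio:", round(ratio, 2), "- review needed" if ratio < 0.8 else "")
```

Checks like this are no guarantee of fairness (they say nothing about why the rates differ), but they at least surface disparities that would otherwise remain hidden inside the model.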
Facial recognition systems, now being rolled out in airports and city surveillance networks everywhere, raise similar worries. Many of these tools are built on Western datasets, and studies abroad show they misidentify darker-skinned faces more often. Joy Buolamwini’s Unmasking AI: My Mission to Protect What Is Human in a World of Machines (2023) offers crucial insight here. Imagine the risks if such errors occur in a country as diverse as Brazil or India, especially when linked to law enforcement. Automated welfare systems meant to distribute relief sometimes leave out the most vulnerable — people with inconsistent records, names spelled differently, or no internet access. Bias doesn’t always look like discrimination or racism; sometimes it’s just exclusion.
The pandemic was particularly revealing in this regard. It put the spotlight on how systems that make use of AI often exclude those who need assistance the most. A fundamental problem is that not all people are represented equally in datasets, or exist in them at all. AI always requires human attention and monitoring; there is an urgent need for all of us to stay vigilant. The same goes for daily tech routines. Voice assistants struggle with accents and regional languages. Translation apps often turn neutral Hindi phrases into gendered English ones — “doctor” becomes “he,” “nurse” becomes “she.” And social media algorithms, obsessed with engagement, can end up amplifying angry or polarizing posts because outrage simply keeps people clicking.
What It Means for Us
A biased system could mean a wrong prediction in a hospital, a job lost or a loan denied. More broadly, it can deepen socioeconomic inequality and diminish trust in technology. If people start to believe that “smart” systems are biased and unfair, they’ll stop using them.

In a country as varied as India, bias has another consequence: it widens the digital divide. When tools are designed mainly for highly educated, English-speaking urban users, rural citizens and speakers of smaller languages are left behind. The risk is that AI could reinforce hierarchies under a new digital label.
Why It Keeps Happening
Fixing bias isn’t merely a problem of coding — it’s a social and cultural one. The data most algorithms rely on comes from those easiest to measure: people who are frequently online and use smartphones for an array of services and tools. Many models used in the Global South are imported from or trained in the West, on datasets that don’t reflect local realities. Beyond that, the tech industry itself still lacks diversity in social and cultural backgrounds, language, and lived experience.

Transparency is another issue. Companies rarely reveal how their algorithms make decisions. A job applicant might never know why their résumé was rejected. A citizen might not realize that a computer, not a person, denied their welfare claim. Without openness, accountability is impossible.
The Human Mirror
There’s hope, though. Data scientists from India and elsewhere are pushing for more representative datasets — ones that include regional languages, rural populations, and different social and cultural groups. Governments in the Global South have also begun drafting rules for ethical AI, emphasizing transparency and fairness. Some companies are conducting independent algorithmic audits to check for bias before systems go public. Payal Arora’s From Pessimism to Promise: Lessons from the Global South on Designing Inclusive Tech (2024) shows an important way forward here, but also underlines how technology alone won’t fix the problem. Bias reflects society, and therefore all of us collectively. To make AI fair, we have to make sure the world it learns from is fair too. This is not only about inclusivity and diversity but also about the willingness to question one’s own assumptions.

In the end, AI bias is not just a technical glitch. It’s a mirror, built from code and data, that reflects back, uncomfortably, who we already are. The challenge isn’t to make machines perfect; it’s to make ourselves more just. For that, we need to stay aware of how closely the data AI learns from mirrors us. The bias in AI is the bias of all of us!