Artificial intelligence, human errors

When AI gets it wrong: why it happens, and how to avoid it.

Francesca Randone | Researcher, AI Lab, Department of Mathematics, Informatics and Geosciences, University of Trieste

'I am not afraid of machines that replace man; I am afraid of machines that do not work.' So said Michael Irwin Jordan, one of the fathers of modern machine learning, speaking at Trieste Next, the science research festival held in the city at the end of September.

Across the vast panorama of artificial intelligence applications, the greatest risks seem to come from algorithms that malfunction, rather than from any human-like 'awakening' of the machines. Errors can arise either from a lack of 'robustness' in the algorithms, which, faced with very similar instances of the same problem, sometimes give very different answers, or from the wrong assumptions on which the machines rely when solving problems. For example, in cases that have actually occurred, image recognition algorithms have assigned different labels (such as 'cat' and 'crane') to images that are indistinguishable to the human eye, or, having only ever seen photos of dogs in the snow, have concluded that every animal in the snow is a dog. These kinds of errors are intrinsically linked to the fact that machines have less capacity for abstraction than the human brain, and can make mistakes that look trivial, almost amusing, to our eyes.
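For readers who want to see the 'dogs in the snow' mistake in action, here is a minimal sketch in Python (using scikit-learn on invented data; none of it comes from any real system mentioned here). In the training set the 'snow' feature is perfectly correlated with the label 'dog', so the model learns the shortcut rather than the concept.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200

# Invented training set: 1 = dog, 0 = cat.
labels = rng.integers(0, 2, n)
snow_background = labels.copy()        # every dog photo has snow, no cat photo does
fur_texture = rng.normal(size=n)       # an "appearance" feature that is pure noise here
X = np.column_stack([snow_background, fur_texture])

clf = LogisticRegression().fit(X, labels)

# A cat photographed in the snow: snow_background = 1.
cat_in_snow = np.array([[1.0, rng.normal()]])
print(clf.predict(cat_in_snow))  # almost certainly [1]: labelled "dog" because of the snow
```

A cat photographed in the snow is then confidently labelled a dog: the model has learned the background, not the animal.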

However, there are also cases where algorithms err because they are 'too human'. The impartiality and objectivity we would expect from an artificial intelligence algorithm are lost if it 'absorbs' human biases during training. This kind of error is even more insidious to track down: after all, even humans find it difficult to tell whether a decision conceals a bias of some kind. With algorithms, the task is further complicated by the fact that they are 'black boxes', whose internal mechanisms are largely incomprehensible to humans. The consequences of these errors can be more or less serious, so much so that there is now a debate on how to guarantee so-called 'algorithmic justice': ensuring that everyone's rights are respected even when decisions are delegated to automatic decision-making algorithms.

Lessons from the past

That computer programmes are not exactly impartial has been known since the 1980s. In a notorious case from those years, a programme used at St George's Medical School in London to screen applicants was found to discriminate against ethnic minorities and women. The programme, used to decide who would be invited to an interview, scored candidates on the basis of information that, in theory, contained no explicit reference to ethnicity. Despite this, a commission of enquiry found in 1988 that the algorithm consistently assigned lower scores to women and to minority candidates, inferring ethnicity from surnames and places of birth.

As the British Medical Journal noted, although the programme had not been conceived as discriminatory, it ended up reflecting the discrimination already present in the human selection process: it had been developed with the aim of replicating the scores awarded in the past by a human committee. Thirty years ago, therefore, the same journal article was already recommending that any institution intending to use an automatic decision-making algorithm for selection should take responsibility for verifying how it works, precisely to avoid further episodes of this kind. Unfortunately, the warning fell on deaf ears.
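How can a programme discriminate without ever being told a candidate's ethnicity? Below is a minimal sketch of the mechanism, on entirely invented data: a model is trained to reproduce past human scores that penalised one group, and two invented flags standing in for surname and place of birth act as proxies.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 1000

# Invented data: `group` is the protected attribute and is never shown to the model.
group = rng.integers(0, 2, n)
# Proxies loosely standing in for surname and place of birth, correlated with `group`.
surname_flag = np.where(group == 1, rng.random(n) < 0.9, rng.random(n) < 0.1)
birthplace_flag = np.where(group == 1, rng.random(n) < 0.8, rng.random(n) < 0.2)
qualifications = rng.normal(size=n)

# Historical human scores that penalised group 1 by a full point.
past_scores = qualifications - 1.0 * group + 0.1 * rng.normal(size=n)

# Train the model to reproduce past scores, using only the "neutral" features.
X = np.column_stack([surname_flag, birthplace_flag, qualifications]).astype(float)
model = LinearRegression().fit(X, past_scores)

predicted = model.predict(X)
gap = predicted[group == 1].mean() - predicted[group == 0].mean()
print(f"average predicted score gap (group 1 minus group 0): {gap:.2f}")
# The gap is clearly negative: the bias is reproduced without ever seeing `group`.
```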

Hidden prejudices

In 2016, the topic of discriminatory algorithms came back into the limelight thanks to an investigation by the US journalism organisation ProPublica. According to ProPublica, COMPAS, a programme used in the courts of several US states to assess defendants' risk of reoffending, exhibited racist behaviour. Specifically, the programme consistently overestimated the risk of recidivism for African-American defendants while underestimating it for white defendants, even for the same crime and the same criminal record. Again, in theory the algorithm had no access to information on the defendant's ethnicity but, according to ProPublica, it could effectively be inferred from the answers to the 137 questions on which the risk score was based.

In a country with a long history of racial inequality, entrusting the assessment of the risk of recidivism to a theoretically impartial and objective algorithm seems like a perfectly reasonable idea. So why should an algorithm developed with such good intentions turn out to be racist? There is no certainty about this, not least because Northpointe, the company that owns COMPAS, has refused to release the source code. However, as we know, algorithms are 'trained' on historical data, and just as the selection programme at St George's Medical School reflected the scores given in past years by human committees, COMPAS was probably also trained on human evaluations, ending up incorporating in its operation the very biases it was designed to avoid.
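ProPublica's case rested on comparing error rates between groups: among defendants who did not go on to reoffend, how often were they nonetheless flagged as high risk? The sketch below shows only how such a per-group false positive rate is computed; the data and the numbers are entirely invented and bear no relation to the real COMPAS figures.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Entirely invented data: two groups, what actually happened, and a risk tool's verdict.
group = rng.integers(0, 2, n)
reoffended = rng.random(n) < 0.3
# An invented tool that flags group 1 as "high risk" more often at equal behaviour.
flagged_high_risk = rng.random(n) < np.where(group == 1, 0.55, 0.30)

# False positive rate: among people who did NOT reoffend, how many were flagged anyway?
for g in (0, 1):
    did_not_reoffend = (group == g) & ~reoffended
    fpr = flagged_high_risk[did_not_reoffend].mean()
    print(f"group {g}: false positive rate = {fpr:.2f}")
```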

And manifest prejudices

There are also cases where programmes can talk, declaring their own biases to the world. In 2016, Microsoft released Tay, a chatbot on Twitter that was supposed to interact with users by simulating the responses of an American girl of about nineteen. In Microsoft's plan, by interacting with users Tay would learn to converse in an increasingly natural way on more and more topics. Too bad, however, that not all users were well-intentioned. In a coordinated attack, trolls began to bombard Tay with provocative and politically incorrect messages. Since the chatbot had been programmed to learn from interactions and to repeat the expressions it picked up, within just sixteen hours the friendly and polite Tay went from cheerfully greeting new users to hating everyone, even declaring that 'Hitler was right'. At that point Microsoft withdrew Tay from Twitter with a public apology.

To avoid further unpleasant episodes, Zo, the chatbot that succeeded Tay, was designed from the outset to sidestep any conversation containing potentially uncomfortable keywords. Today, fortunately, ChatGPT seems to be doing better.

Algorithmic justice 

In the world of research, the iconic name in the battle for algorithmic justice is that of computer scientist Joy Buolamwini. While a student at MIT's Media Lab, Joy worked on an algorithm that would project a mask onto the faces of people framed by a webcam. Joy, who is of Ghanaian origin, noticed that the algorithm could not identify her face, unlike those of her white colleagues, unless she wore a white mask. Similar experiences in the past led her to conclude that the problem lay in the databases on which most facial recognition algorithms, including the one used in her project, are trained. In most of these databases, the majority of the faces belong to white men. The result is that, during training, the algorithms 'see' many faces of white men but only a few faces of women and of people from ethnic minorities (and here, lo and behold, racial discrimination returns). When confronted with a new face, the algorithms find it harder to recognise it if it belongs to one of the least-seen categories, exactly as happened with Joy's face.
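Here is a deliberately crude sketch of how an unbalanced training set turns into unequal performance, on invented one-dimensional data (no real face-recognition system works on a single number): the under-represented group's faces are scarce in training and, along this invented feature, closer to the non-face examples, so the learned detection threshold ends up tuned to the majority group.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def faces(n, centre):
    # Invented 1-D "image feature" for faces of a given group.
    return rng.normal(centre, 0.5, n)

# Training set: plenty of non-faces and majority-group faces, very few minority-group faces.
train_X = np.concatenate([
    rng.normal(0.0, 0.5, 500),  # 500 non-face examples
    faces(500, 2.5),            # 500 faces from the over-represented group
    faces(20, 1.0),             # only 20 faces from the under-represented group
]).reshape(-1, 1)
train_y = np.array([0] * 500 + [1] * 520)

detector = LogisticRegression().fit(train_X, train_y)

# Detection rate per group on fresh samples.
for name, centre in [("over-represented", 2.5), ("under-represented", 1.0)]:
    rate = detector.predict(faces(1000, centre).reshape(-1, 1)).mean()
    print(f"{name} group: {rate:.0%} of faces detected")
# The threshold the model learns suits the majority group; the minority group's
# detection rate comes out far lower.
```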

The Coded Gaze, the video in which Joy Buolamwini documents the bias of algorithms.

Joy quickly realised that the problem goes far beyond facial recognition programmes: whether an algorithm differentiates on the basis of ethnicity, gender, or any other characteristic can have enormous consequences for people's lives. So, in 2016, Joy founded the Algorithmic Justice League with the aim of raising awareness of issues of algorithmic justice. Today, AJL operates across various communication channels and has a documentary ('Coded Bias') and several publications to its credit.

Europe against mistakes

2016 was not only an important year for activism. In that year, the European Union adopted the General Data Protection Regulation (GDPR), which came into force in 2018. The regulation updates European rules on data processing, but it also addresses automated decision-making processes, i.e. precisely those in which an algorithm is responsible for a decision. In it, the member states agreed that "a decision based solely on automated processing, including profiling, which produces an adverse legal effect or significantly affects the data subject shall be prohibited unless it is authorised by Union law or by the law of the Member State to which the data controller is subject and provides adequate safeguards for the data subject's rights and freedoms, at least the right to obtain human intervention by the data controller".

Although it is not yet clear how this provision will be applied, its objective is clear: to limit the use of fully automated decision-making algorithms, i.e. those operating without human supervision. Where such algorithms are authorised, the person affected by the decision must be given adequate guarantees, including the right to have the decision referred to a human being on request. In short, if you ask, you are entitled to hear the opinion of a flesh-and-blood doctor or financial advisor, and not just that of the programme. This may not sound like much, and many have noted the lack of a genuine 'right to an explanation' obliging those who use an algorithm to explain how it reaches its decisions (i.e. exactly what was hoped for back in 1988). Nonetheless, the GDPR is certainly an important step forward in what is an almost entirely new area of jurisprudence.

The GDPR was followed by the European Artificial Intelligence Regulation, which came into force in 2024. Also known as the 'AI Act', this second document classifies artificial intelligence systems according to the potential risks they pose, and requires vendor companies to meet stricter safety requirements for systems deemed high-risk, such as medical software or systems used for staff recruitment. The law will become fully applicable over the next two years and can be seen as a further step in the regulation of these technologies.

Meanwhile, outside the laboratories, what can we do? As users of the products of artificial intelligence, it is our responsibility and our duty to push the research world to stay alert to any violation of rights, by cultivating awareness and making critical, discerning use of the tools we are given. Each of us, in short, can help shape the future of artificial intelligence.
