In conversation with Michal Měchura: When the machine asks the human
Language technologist Michal Měchura has always wanted a tool that allows machine translators to ask humans questions about ambiguous phrases. With Fairslator, he has developed such a tool himself. Michal talks to us about bias and ambiguity in automatic translations.
By Stephanie Hesse
Michal Měchura, one of the first sentences on the fairslator.com website reads: “We need to talk about bias in machine translation.” Why is that?
We need to talk about bias in machine translation because this problem actually exists. When you look at the entire history of machine translation, its ambition has always been to provide a very straightforward and linear user experience: you put the text in in one language and it comes out in another. This creates the false impression that there is always one correct translation. The reality, however, is not that simple. There are ambiguities in the source text that sometimes cannot be resolved automatically. You have to ask the human users what they mean by certain things.
Existing machine translators make some sort of assumption instead, and this is basically the cause of bias. It is a problem that has been bothering me for a long time. It always bothered me that machine translators never asked me how I meant certain things, and I always wished for a tool that would do that, until I built one myself.
Can you describe in a few words what Fairslator does and how it works?
Fairslator is a plug-in that you can use on top of an existing machine translation engine such as Google Translate, DeepL or Microsoft Translator. At the moment, Fairslator works for translations from English into German, Czech or Irish. It takes the output of machine translation and tries to detect instances of bias in it: places where the machine translator has decided on one specific reading but where other readings are possible, too. Then it gives you options for how to translate the ambiguous passage.
Let’s take an example. The translation might be biased by the male reading of the word ‘teacher’. There are many cases where such kinds of ambiguity cannot be resolved from the text itself. Most machine translators make a choice based on what has been seen statistically more often in the training data. Instead, Fairslator actively disambiguates by asking the human user: Do you want to translate the word ‘teacher’ as ‘male teacher’ (Lehrer) or ‘female teacher’ (Lehrerin)? It helps you translate things that are more in accordance with what you actually mean instead of what the machine thinks you probably mean.
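The idea of asking instead of guessing can be illustrated with a minimal Python sketch. This is not Fairslator's actual code: the tiny lexicon and the names `GENDERED_NOUNS`, `find_ambiguities` and `apply_choice` are hypothetical, and the sketch assumes that a simple word lookup is enough, which a real system handling articles and adjective agreement could not rely on.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-lexicon: English noun -> German forms for each reading.
# An assumption for illustration only; a real tool needs linguistic analysis.
GENDERED_NOUNS = {
    "teacher": {"male": "Lehrer", "female": "Lehrerin"},
    "doctor": {"male": "Arzt", "female": "Ärztin"},
}


@dataclass
class Question:
    source_word: str  # the ambiguous English word
    options: dict     # reading ("male"/"female") -> German form


def find_ambiguities(source: str) -> list:
    """Return one question per source word whose gender the text leaves open."""
    questions = []
    for token in source.lower().split():
        word = token.strip(".,!?")
        if word in GENDERED_NOUNS:
            questions.append(Question(word, GENDERED_NOUNS[word]))
    return questions


def apply_choice(translation: str, question: Question, reading: str) -> str:
    """Rewrite the machine translation to match the reading the user chose."""
    chosen = question.options[reading]
    # Match any known form as a whole word, whichever one the engine picked.
    forms = "|".join(re.escape(form) for form in question.options.values())
    return re.sub(rf"\b(?:{forms})\b", chosen, translation)


# Usage: the engine guessed the male form; the human picks the female reading.
questions = find_ambiguities("I am a teacher.")
print(apply_choice("Ich bin Lehrer.", questions[0], "female"))
```

The design point is that the question is derived from the source text, while the correction is applied to the target text, so the user never has to read German to make the choice.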
Is this what you call a ‘human-in-the-loop translator’?
Exactly. Bringing humans into the loop is a trend we have seen in AI more recently. It turns out that certain things are too difficult for AI to figure out, so you need humans to intervene. This is exactly what Fairslator is doing. You might remember the time before machine translation was a big thing. People were investing a lot of effort into computer-assisted translation, where humans translated with software running in the background, constantly suggesting how they might want to translate certain things. The human was still in control: they had to accept, reject, or edit these suggestions. Now the balance has shifted, and we have systems that translate texts completely automatically. Maybe we need to bring the human back into the equation. I also call it “human assistance machine translation”.
What kinds of bias interest you in this project? Can you give some examples?
Translations produced by machines are often biased because of ambiguities in gender, number and forms of address. For example, when translating from English into French, should ‘student’ be translated as male ‘étudiant’ or female ‘étudiante’? Should ‘you’ be translated as informal ‘tu’ or formal ‘vous’? Biases in translation can happen in any aspect of meaning, but I’m concentrating on the ones that come up very often. Even within this smaller subset of biases, it is still a very broad and complex area. Research about gender bias in machine translation has concentrated on the easy cases, the low-hanging fruit, such as gendered pairs of nouns. But in many languages, a lot of other classes of words are affected by gender bias. In most Slavic languages, and also in some Romance languages such as French, you must adjust adjectives and past participles as well. These are just a few examples that show how complex this topic is.
How does Fairslator deal with intended gender-neutral language?
Fairslator has a feature that facilitates gender-neutral language. For German nouns, for example, it automatically creates forms with the gender star (Gendersternchen) or other neo-forms. Sometimes it can be difficult to get this right because adjectives, articles, or other words in the sentence need to agree grammatically. But Fairslator does its best to come up with gender-neutral forms.
The makers of machine translation tools are aware of these kinds of biases and are working on solutions, for instance by suggesting alternative translations such as ‘Arzt’ or ‘Ärztin’ for the word ‘doctor’. What do you think about this solution?
Indeed, machine translators have started offering you more options. The problem I see is that they offer these options without explaining the differences between them. This is not useful, especially if you don't know the language you are translating into. I spent a lot of time coming up with a user experience that is “baby proof”, so even people who have never heard that other languages have male and female words will be able to understand what the difference is. It’s about disambiguating information in the source language and not in the target language.
What are your plans for Fairslator in the coming months and years?
My long-term vision is that the kind of functionality Fairslator offers will become standard in all machine translation systems.
My even longer-term vision is: I want people to stop talking about translation all the time. Instead of asking how to translate something correctly, you should ask yourself what you actually want to say and express yourself as clearly as possible already in the source language. Start talking about expressing ideas simultaneously in many different languages. It’s worthwhile to step back and let this shift happen in people’s minds.