Correcting Severe Errors Made by Machine Translation
Topic: Machine Translation
This article shows a few examples of severe errors made by machine translation (MT) engines, the kind that most of us want to prevent or correct. First, I will categorize what I consider a severe MT error:
- Errors with economic consequences for the company
- Errors due to offensive or inappropriate words
- Errors with legal or safety consequences
Errors with economic consequences
Economic consequences come from errors that prevent a customer from doing business with the company. For eBay, these are mostly issues that prevent buyers from buying. eBay customers start by entering a query to search for the items they want to buy. A query entered in their native language is translated into English, for example, and the English query is then used to find items. So it is critical that the translation of the query is accurate enough to return the best possible results. When a query is translated in a way that returns no results, this becomes a severe error, because the customer cannot buy anything based on that translation.
Our example comes from Brazilian Portuguese: when searching for an iPhone case, Brazilians will enter the term “capinha” for “case”, which is a diminutive form of the word. Much of the corpora used to train MT engines may come from formally written sources, and these sources do not use diminutives very often. As a result, an engine may not translate “capinha” into “case”; in fact, it may not translate it at all. The search then produces no results, which makes this an important error. We fixed this issue, which made our Brazilian customers happy.
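One common way to handle a term like “capinha” is a small glossary that is consulted before the MT engine. The sketch below is only an illustration under stated assumptions, not a description of how the fix was actually implemented at eBay: the glossary contents, the function names, and the stand-in engine `fake_mt` are all hypothetical, and translating token by token is a deliberate simplification.

```python
# Hypothetical glossary of colloquial query terms the MT engine may miss.
QUERY_GLOSSARY = {"capinha": "case"}


def translate_query(query, mt_translate, glossary=QUERY_GLOSSARY):
    """Translate a query token by token, preferring glossary entries.

    mt_translate is any callable that maps one source token to a target
    token; it stands in for a real MT engine here.
    """
    translated = []
    for token in query.lower().split():
        if token in glossary:
            # A curated entry overrides whatever the engine would produce.
            translated.append(glossary[token])
        else:
            translated.append(mt_translate(token))
    return " ".join(translated)


def fake_mt(token):
    """Stand-in engine (assumption): unknown words pass through unchanged."""
    known = {"para": "for"}
    return known.get(token, token)


print(translate_query("capinha para iPhone", fake_mt))
```

Without the glossary entry, the stand-in engine would leave “capinha” untranslated, which mirrors the failure described above.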
Another type of error that could have economic consequences would be the translation of “free shipping” as “paid shipping”, or the translation of “seller pays for shipping” as “buyer pays for shipping”. This could result in fewer purchases. However, we haven’t seen this happen.
Errors due to offensive or inappropriate words
Words can be offensive or inappropriate because they are explicit or have sexual connotations. Here are two examples:
Consider the word “female”. In many languages, the word “female” is translated differently depending on whether you are referring to a person or to an animal (or a mechanical part). If you are referring to a person, the translation for “female jacket” should sound like “feminine jacket” or “jacket for women”. If you are referring to an animal, the word is more anatomical, expressing the idea of something being physically female. Ideally, the MT engine should not translate a female jacket as “anatomically” female and should instead convey the meaning of a style. This is an issue that we found across several languages. To illustrate this, here is what happens in two languages.
The other example is language-specific. There is a doll called the Adora doll, where Adora is a brand. It turns out that “adora” is also a word in Portuguese, meaning that you “adore” or “love” someone. The translation of “boneca Adora” from Portuguese turned into “love doll” in English, and the results of a search for “love doll” may not be the most suitable if someone is looking for a doll for children.
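A typical safeguard for this kind of error is to shield known brand names from the engine with placeholders and restore them afterwards. The following is a minimal sketch, assuming a hypothetical do-not-translate list; the names `BRANDS`, `protect_brands`, and `restore_brands` are illustrative and do not refer to any real system. Matching is case-sensitive on purpose, so the capitalized brand “Adora” is protected while the ordinary verb “adora” is not; queries typed entirely in lowercase would need a different strategy.

```python
import re

# Hypothetical do-not-translate list of brand names.
BRANDS = ["Adora"]


def protect_brands(text, brands=BRANDS):
    """Replace brand names with placeholders the MT engine should ignore."""
    mapping = {}
    for i, brand in enumerate(brands):
        placeholder = f"__BRAND{i}__"
        # Whole-word, case-sensitive match of the brand name.
        pattern = re.compile(rf"\b{re.escape(brand)}\b")
        if pattern.search(text):
            text = pattern.sub(placeholder, text)
            mapping[placeholder] = brand
    return text, mapping


def restore_brands(text, mapping):
    """Put the original brand names back after translation."""
    for placeholder, brand in mapping.items():
        text = text.replace(placeholder, brand)
    return text


protected, mapping = protect_brands("boneca Adora")
# Stand-in for the MT step (assumption): only "boneca" gets translated.
translated = protected.replace("boneca", "doll")
print(restore_brands(translated, mapping))
```

The stand-in translation step keeps the source word order; a real engine would also be expected to reorder the phrase for English.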
Errors with legal or safety consequences
Errors of a legal or safety nature could come from converting units of measurement from one system (English units) to another (metric). This kind of issue is critical in medical translations, where a medication dosage given in English units and turned into a metric unit, keeping the same number, could put a person’s life at risk! This is less likely to be an issue in an e-commerce environment. Also, our engine is trained to keep the same unit system, so we have not seen issues of that nature.
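A simple automated check can flag translations where a number keeps its value but the unit changes. The sketch below is an assumption-laden illustration, not part of the engine described above: the unit list is deliberately short, the function names are hypothetical, and decimal commas (as in “1,5 kg”) are not handled.

```python
import re

# Small, illustrative set of units; a real check would need a fuller list.
UNIT_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*(mg|kg|g|oz|lb|cm|mm|in|ft|m)\b",
    re.IGNORECASE,
)


def measurements(text):
    """Return the sorted (number, unit) pairs found in the text."""
    return sorted((num, unit.lower()) for num, unit in UNIT_PATTERN.findall(text))


def units_preserved(source, target):
    """True when source and target contain the same numbers with the same units."""
    return measurements(source) == measurements(target)


print(units_preserved("10 oz bottle", "frasco de 10 oz"))
print(units_preserved("10 oz bottle", "frasco de 10 g"))
```

The second call compares “10 oz” with “10 g”, the same number paired with a different unit, which is exactly the dangerous case described above.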
If you know of other examples of severe errors made by machine translation, we would love to hear from you so that we can improve our system.
And if you enjoyed this article, please check other posts from the eBay MT Language Specialists series.