Well-kept translation memory secrets



André Guyon
(Language Update, Volume 8, Number 4, 2012, page 28)

In my spare time, I’m working with a friend on developing a course on translation redundancy and translation memories. To me, redundancy relates to the exact and fuzzy matches within and between texts. Below you’ll find a sneak peek of Chapter 2 of the course!

Forget everything you think you know

So your software provides you with figures that you think you understand, right? Let’s see about that.

Software applications calculate the number of exact matches and partial matches. The latter are also referred to as fuzzy matches. An exact match occurs when the wording is the same or, in some cases, when both the wording and the formatting are the same. As for fuzzy matches, different applications may have different approaches to punctuation, case, accents, etc. So why is it that match rates vary from one application to the next? It’s because there is no standard approach to redundancy. Below are some examples that will likely surprise you.

What match rate would you assign to the following sentences, and why?

  • Our country is beautiful.
  • Our country is big.
  • Our country is strong.
  • Our country is important.

To answer this question, I used two different translation memory applications. The first one calculated the match rate as 75%, the second as 80%. You may be wondering, why 80%? It’s because the second application does not assign the same value to all words—with good reason.

Do you think the latter approach is a good one?

That was actually a trick question, since the percentage doesn’t really tell you that much anyway. We can all agree that the hardest part is not translating “Our country is”; it’s choosing the right adjective. Also, one word in a sentence of four words is not necessarily considered 25% of the words in that sentence. You could choose to assign more or less value to certain words, but that still has nothing to do with the actual effort required to produce a translation.

It would be silly to conclude that the translator has only 25%—or worse, 20%—of the work left to do. If it’s true that not all words have the same value, then when a translation memory calculates the amount of effort saved by the matches, it should increase—not decrease—the percentage value of what remains to be translated.
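To make the 75% and 80% figures concrete, here is a minimal sketch of how two tools could arrive at them for “Our country is big.” against a stored “Our country is beautiful.” Everything in it is an assumption made for illustration: the position-by-position comparison, the function names and above all the weight given to “country” are hypothetical and do not describe how any real translation memory application scores matches.

```python
def tokenize(sentence):
    """Lowercase the sentence, drop the final period and split on spaces."""
    return sentence.lower().rstrip(".").split()


def match_rate(source, candidate, weights=None):
    """Weighted share of the words in `source` found at the same position
    in `candidate`. Words missing from `weights` count for 1.0."""
    weights = weights or {}
    src, cand = tokenize(source), tokenize(candidate)
    total = sum(weights.get(word, 1.0) for word in src)
    matched = sum(weights.get(s, 1.0) for s, c in zip(src, cand) if s == c)
    return matched / total


new_sentence = "Our country is big."
stored_sentence = "Our country is beautiful."

# Every word has the same value: 3 matching words out of 4.
equal_rate = match_rate(new_sentence, stored_sentence)

# Hypothetical weighting: "country" counts double, so 4 out of 5.
weighted_rate = match_rate(new_sentence, stored_sentence, {"country": 2.0})

print(f"equal weights: {equal_rate:.0%}")
print(f"weighted:      {weighted_rate:.0%}")
```

Both numbers come from the same pair of sentences. Only the arbitrary weighting changes, and neither says anything about how hard the remaining adjective is to translate.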

Ill-defined calculations often make no sense to language professionals, but that’s nothing next to the fact that all sentences are considered equal by translation memories.

Which of the following match results do you think would save you more time?

  1. Government of Canada (100 times)
  2. The user is responsible for the use of the usernames and passwords required by the application of services and for all direct and indirect activities enabled by these usernames and passwords. (once)

Make sure you don’t include any time spent reworking a translation. I’m talking about just the time saved because the memory found a good translation. In the end, even though 100 hits on “Government of Canada” equals a “savings” of 300 words, the amount of effort saved by those hits is negligible compared with the amount of effort saved by the long sentence.
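The gap between words “saved” and effort saved is easy to see with a quick count. The sketch below assumes the simplest possible metric, words matched multiplied by number of hits, and uses the 100 hits mentioned above. The function and variable names are illustrative only.

```python
def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


short_segment = "Government of Canada"
long_segment = (
    "The user is responsible for the use of the usernames and passwords "
    "required by the application of services and for all direct and "
    "indirect activities enabled by these usernames and passwords."
)

# Naive "savings": words in the segment times the number of hits.
short_savings = word_count(short_segment) * 100  # 3 words x 100 hits = 300
long_savings = word_count(long_segment) * 1      # 31 words x 1 hit = 31

print(f"short segment: {short_savings} words 'saved'")
print(f"long segment:  {long_savings} words 'saved'")
```

By this count the short segment looks almost ten times more valuable than the long one, which is the opposite of what a translator experiences.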

Since I imagine we all pretty much agree up to now, I would like to add fuel to the debate. While short strings of words such as “Government of Canada” are not a huge problem for translators, let’s see what happens with “corporate software,” a term that can be translated in different ways depending on the context. I found several contexts and equivalents on federal government websites, including the following three:

  • logiciels du Ministère
  • logiciels d’affaires
  • logiciels

As it turns out, “corporate software” can be translated in at least three different ways. To find the right translation for my text, I would have to read the surrounding paragraphs or pages. For titles and other short phrases, the amount of contextual reading required can easily exceed 1000% of the words in the segment to be translated. On the other hand, long sentences contain all the necessary context most of the time.

Experienced translation memory users are probably already familiar with the concept of an “in context exact match” (ICE match). It’s commonly believed that if you have the right software, you don’t need to do any contextual reading when ICE matches come up, since you know that what surrounds the match is exactly the same. That’s logical, right?

I bet you can guess where I’m going with this. That’s right: it’s logical in theory, but not so much in practice. An obvious cliché that I can’t resist using is that ICE matches are a slippery slope.

When exactly the same source-language sentence appears in an almost identical text, that is a good indication that its translation will be the same, but it is not a guarantee. The following are some examples:

From French to English

  • Jane est une excellente travailleuse. Je l’adore.
  • Jane is an excellent worker. I love her!

Later on, in an almost identical text, we have the following to translate:

  • Jacques est un excellent travailleur. Je l’adore.
  • Jacques is an excellent worker. I love her!

Some people will be sure to point out that if the ICE feature is well designed, it will take the surrounding passages into account.

But does that necessarily mean that the gender will be right? Unfortunately not!
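A small sketch shows why. It assumes a deliberately simple definition of an ICE match: the segment is identical and so are its immediate neighbours. The function `is_ice_match`, its `window` parameter and the filler sentence “Le projet avance bien.” are all hypothetical; real tools define and detect context in their own ways.

```python
def is_ice_match(new_doc, i, old_doc, j, window=1):
    """Return True when segment i of new_doc equals segment j of old_doc
    and the `window` segments before and after are identical too.
    Positions outside a document are treated as None."""
    if new_doc[i] != old_doc[j]:
        return False
    for k in range(1, window + 1):
        for a, b in ((i - k, j - k), (i + k, j + k)):
            seg_new = new_doc[a] if 0 <= a < len(new_doc) else None
            seg_old = old_doc[b] if 0 <= b < len(old_doc) else None
            if seg_new != seg_old:
                return False
    return True


# Case 1: the gender cue sits right next to the repeated sentence.
old_adjacent = ["Jane est une excellente travailleuse.", "Je l'adore."]
new_adjacent = ["Jacques est un excellent travailleur.", "Je l'adore."]
print(is_ice_match(new_adjacent, 1, old_adjacent, 1))  # neighbours differ

# Case 2: one unchanged (hypothetical) sentence separates the cue
# from the repeated sentence.
old_distant = ["Jane est une excellente travailleuse.",
               "Le projet avance bien.", "Je l'adore."]
new_distant = ["Jacques est un excellent travailleur.",
               "Le projet avance bien.", "Je l'adore."]
print(is_ice_match(new_distant, 2, old_distant, 2))    # neighbours match
```

In the first case the context check rejects the match. In the second it accepts it, and the memory would still offer “I love her!” for Jacques. Widening the window only moves the problem further away.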

The fact of the matter is that adding one sentence specifying the gender of the person whose job is discussed in the next 25 pages of a text could require changes throughout the text. Imagine a text that instructs translators on how to use various translation memories. Then, one fine day, along comes someone who gets upset and decides that since this profession is dominated by women, the English examples should refer to a female translator.

It’s plausible, right? Obviously, when translating the revised English, you would have to change not only those passages but the rest of the text as well. Other languages raise similar issues. I’ve heard that in Italian, bravo becomes brava if you’re addressing a woman; that in Portuguese, you say obrigado or obrigada depending on whether the person giving thanks is a man or a woman; and that in Japanese, the tone used when referring to oneself is different from the one used when referring to others, as more importance must be assigned to others than to oneself in a report.

As you can tell, part of what we thought we knew needs to be revised. At the Translation Bureau, we are looking for a way to measure effort that takes into account sentence length and many other related factors. Once we have that, we will make sure that such factors are clearly understood by all our colleagues.
