Tips on Providing Context for Translation
If you have been involved in localization for any length of time, you have likely felt the pain and cost of going back to correct translations in multiple languages because some word or phrase was not clear in the source text. In this blog post we explore why context matters in translation and how to provide it.
Usually, a mistake is discovered in QA or reported by a user, and then you must track down the exact string and work with your vendors to update it with the correct translation. Chances are if the source was misunderstood in one language then it was also misunderstood in other localizations, so you should probably check those too. This is normal and common due to the ambiguity and dynamism of human language. In short, people are always making up new words and it’s not always clear what even the old ones mean without some “context”.
By now you have certainly heard the issue of context come up from your translation provider. Context is a big topic, but let’s simply define it as the surrounding environment and background of communication that enables you to understand the full meaning of a message. When someone talking to you uses the pronoun “she”, the only reason you know who “she” is, is that you have been listening to the conversation and remember from a previous statement who is being referenced.
Many errors in translation of app strings are due to lack of context
The context problem is inherent to the app localization process. There are two basic models for managing translatable strings in apps. Either the developers extract the text strings into resource files where they reside and the app references them using keys, or the strings continue to live within the code or a database and are collected into resource files only when it is time for localization. There are a host of good reasons why this is the case, but the upshot is that either way what is sent out for translation is a file of individual strings almost completely removed from the surrounding context that helps explain the intended meaning. It is like taking the “she” statement from above, separating it from the rest of the conversation, and handing it to a third party and expecting them to know who “she” is too.
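To make the first model concrete, here is a minimal Python sketch of key-based string lookup. Everything in it is an assumption made for illustration: the `RESOURCES` table, the keys, and the helper names `t` and `export_for_translation` are hypothetical and do not come from any particular framework. It shows how the app sees a string at a specific place in the UI, while the export handed to translators contains only bare key and text pairs.

```python
from typing import List

# Hypothetical resource table. In a real app this would be loaded from a
# resource file; the keys and texts here are invented for illustration.
RESOURCES = {
    "en": {
        "msg_001": "She approved it.",
        "btn_ok": "OK",
    },
}


def t(key: str, locale: str = "en") -> str:
    """Look up a UI string by key, falling back to the English source text."""
    table = RESOURCES.get(locale, {})
    return table.get(key, RESOURCES["en"][key])


def export_for_translation(locale: str = "en") -> List[str]:
    """Return what a translator typically receives: bare key/text pairs."""
    return [f"{key} = {text}" for key, text in RESOURCES[locale].items()]


if __name__ == "__main__":
    # Nothing in this output says who "she" is or where "OK" appears.
    for line in export_for_translation():
        print(line)
```

The export carries no information about which screen, dialog, or conversation a string belongs to, which is exactly the gap the rest of this post is about.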
In other words, your product managers, developers, technical writers, user experience specialists, marketing team, and everyone else have put a lot of effort into researching and crafting a focused and engaging brand message and experience to drive conversions and traffic in your home market. Now, as you plan to go global and repeat that success in foreign markets, we’re going to blow apart that message into disaggregated, atomized strings in a plain text file and expect our translators to recreate the message from the rubble and adapt it to their local market. This is, shall we say, a big ask.
The solution is to provide crucial context to your translation supplier from step one and to do so using an advanced translation management system (TMS) that ensures shared access in the cloud to the up-to-date contextual information for repeatable reuse during every iteration. Let’s leave the tech aside for the moment. It is possible to manage this with a melange of offline spreadsheets and documents; it is just much more error-prone and less efficient (narrator voice: but really, do the tech. It’s totally worth it).
The three basic types of problems that arise from a lack of context
You and your localization team live and breathe this content every day and know what it means, but in almost all cases that is not true of your translation supplier. Most context issues can be boiled down into three types:
- “Which of the several possible meanings of this word or phrase do you mean?” Among the human languages, English is notorious for “homographs” (i.e. words that are spelled the same but mean different things) and other types of ambiguity. If you look up a word like “fly” in the dictionary you will see that it could be either a noun or a verb and that each of those can have multiple additional “senses”, or different possible meanings. Absent the surrounding context of a string, your translator may not be able to tell which sense is intended.
- “Which usage of this term or phrase is intended in this specific instance?” Even within the known and validated meanings of your strings there will often be multiple choices for a word or phrase. And the correct meaning will depend on where and how the term is used in your user interface. But when your supplier is translating only a text-based strings file, they may not have access to your user interface. It may not even be released yet.
- “What on earth does this even mean lol?” If your ambitions include innovation, and especially if you are a game developer, you are most likely making up new words that no one has seen before! There is no way you can expect your translators to know what these mean unless you tell them. The name you give to your innovative tech, your cleverly crafted brand messaging, your in-world characters, places, and objects: all of these will be new to your translators as well.
Let’s look at a quick example of an actual UI label from our own app, Transifex. You don’t even have to translate it, just try to figure out what it means without any of the surrounding context:
“Translation Memory Fill-up”
Got it? Well, if you’re a user of Transifex then I sure hope you know what this means and are taking full advantage of it. But if you are new to either Transifex or localization in general, it may not be so clear:
- What is “Translation Memory”? It looks like a specialized term for a technology, but what does it mean? Does it already have an established translation?
- What is being filled up? Is the Translation Memory being filled up with something, or is the Translation Memory filling something else? The distinction between subject and object, which is often unclear in English, can require very different translations in various languages.
Let’s see how a little bit of added context helps us understand the true meaning of this phrase. The screenshot of the UI below gives the translator several clues:
- The subheading “Pre-translation”, which indicates that this phrase relates to the automatic insertion of translations before strings are assigned to a translator. [ed: “Pre-translation” is another specialized localization term, but for this example let’s assume you already know what it means.]
- A check-box next to the label. This looks like some kind of configurable setting. Makes sense for a translation tool.
- The detailed description that tells you exactly what this setting option is all about.
Now, thanks to the visual sample of how the phrase is used in the context of the UI, the translator can correctly translate this setting label on the first try. Without this vital context, the translator might guess wrong, and you might not discover the incorrect translation until after release. Then it must be reported (possibly by your customer!), tracked down, fixed in every language, and re-released, all of which costs time and money and degrades the user experience.
Best practices for providing context to ensure translation quality
With the example above we can see that visual context can be essential to understanding the meaning of UI strings. The reason is that even though in most cases these three strings will appear in sequence in the resource file, there would be no indication that the string “Pre-translation” is a subheading or even that these strings are necessarily related to one another when they are nested in a long list of other strings. Like so:
038 Previous string
039 Pre-translation
040 Translation Memory Fill-up
041 Checking this option will automatically translate phrases with exact matches from the Translation Memory.
042 Next string
Of course, we know what we’re looking at by now, but you can see how just the string list in plain text is somewhat less clear without the added context.
Happily, advanced translation management tools like Transifex offer a number of ways to provide context to your translators directly linked to strings while they are working:
- You can map screenshots to strings.
- If your resource files contain comments or context fields, this data can be imported and shown to the translator as metadata on the string. Sometimes a screenshot alone is not enough context. My example above happened to contain a definition in it, but many don’t for buttons or menu items, etc. So text descriptions are very helpful to translators.
- You can also manually add comments to strings in the editor when none are imported from the source file. These comments remain linked to that string throughout your iterations.
- For website translations, you can add a domain to your project, and with the simple addition of a JavaScript snippet to your pages, Transifex will scan them to match the strings in the project. Translators will see a list of URLs where the string occurs, and clicking on the link shows the page in context while highlighting the selected string.
Altogether, Transifex enables you to centralize up-to-date context information in the cloud (no versioning problems) and link it directly to strings so that translators do not have to hunt for it. The context information stays in the project through your iterations so the translators always have it at their fingertips.
Even if you are not ready to invest in a proper TMS (do it!), there are still things you can do to help provide context without major changes to your codebase. If you are planning to localize your app, you should prepare to facilitate translation from the earliest possible stage. This includes structuring your data so that it can be easily managed and translated, a process known as internationalization. Here are three simple tips:
- Separate resource files by component or module. One of the challenges of correcting translation context issues is locating the precise string that needs to be updated. E.g. which of the 35 copies of “OK” is the one that needs a different translation based on its specific usage? When QA or a user reports an issue, it is easier for your translators to find the string if the strings are organized according to the structure of your app.
- Use descriptive string IDs. String keys come in many forms, often simple alphanumeric strings which offer no useful information to a translator or, worst of all, exact copies of the string text. In Transifex the string IDs are visible to the translator, so in addition to helping locate exact strings, they also present an opportunity to give some useful context to the translator about the string. You can compose your own, but a namespace structure is especially helpful to translators.
- Insert comments in the resource file. As mentioned above, Transifex will import string comments from compatible formats and display them to the translator. These are your strings. You wrote them so you are the one person who knows best what they mean. Explain them upfront so that your localization team will not be constantly asking you what they mean.
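As a rough illustration of the second and third tips, the following Python sketch flags string IDs that give a translator little to work with. The function name `lint_strings`, the data layout, and the "opaque code" heuristic are all assumptions made for this example, not part of Transifex or any other tool; adjust the rules to your own naming scheme.

```python
import re
from typing import Dict, List, Optional, Tuple

# Each entry maps a string ID to (source text, optional translator comment).
Entry = Tuple[str, Optional[str]]


def lint_strings(entries: Dict[str, Entry]) -> List[str]:
    """Return warnings for IDs or comments that offer translators no context."""
    warnings = []
    for key, (text, comment) in entries.items():
        if key == text:
            # The ID simply repeats the string, so it adds no information.
            warnings.append(f"{key!r}: ID is a copy of the string text")
        elif re.fullmatch(r"[A-Za-z]*\d+", key):
            # Assumed heuristic: IDs like "040" or "str040" are opaque codes.
            warnings.append(f"{key!r}: ID is an opaque code")
        if not comment:
            warnings.append(f"{key!r}: no comment for translators")
    return warnings


if __name__ == "__main__":
    sample = {
        "040": ("Translation Memory Fill-up", None),
        "OK": ("OK", None),
        "settings_option_TMfillup": (
            "Translation Memory Fill-up",
            "checkbox to turn TM fill-up on or off",
        ),
    }
    for warning in lint_strings(sample):
        print(warning)
```

A check like this could run before each hand-off to translation, so that missing context is caught by the people who wrote the strings rather than by the people who have to guess at them.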
These best practices are all easily within your power to do without big investments. As an illustration of their value, look at this rearrangement of our sample resource file from before. Instead of an undifferentiated list of strings in one long file, you can see how the use of the file name, string ID, and comments can be used to add some contextual meaning to the strings to help the translators find and understand strings:
[Project_Settings.properties]
---------------------------------------------------------
/* subheading for pre-translation settings */
"settings_section_subhead" = "Pre-translation";

/* checkbox to turn TM fill-up on or off */
"settings_option_TMfillup" = "Translation Memory Fill-up";

/* description of TM fill-up setting */
"settings_option_TMfillup_description" = "Checking this option will automatically translate phrases with exact matches from the Translation Memory.";
---------------------------------------------------------
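To show how comments in a file like this can travel with their strings, here is a small Python sketch that pairs each comment with the key and text that follow it. It is not how Transifex or any real importer works; the regular expression and the `parse_entries` helper are assumptions written only for the sample layout above, and they do not handle escaped quotes or entries without a comment.

```python
import re
from typing import Dict, List

# Matches one entry of the form: /* comment */ "key" = "value";
ENTRY_RE = re.compile(
    r'/\*\s*(?P<comment>.*?)\s*\*/\s*'
    r'"(?P<key>[^"]+)"\s*=\s*"(?P<text>[^"]*)"\s*;',
    re.DOTALL,
)


def parse_entries(source: str) -> List[Dict[str, str]]:
    """Return one dict per entry with 'comment', 'key', and 'text' fields."""
    return [match.groupdict() for match in ENTRY_RE.finditer(source)]


SAMPLE = '''
/* subheading for pre-translation settings */
"settings_section_subhead" = "Pre-translation";
/* checkbox to turn TM fill-up on or off */
"settings_option_TMfillup" = "Translation Memory Fill-up";
'''

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        # Show the translator the ID and the comment alongside the text.
        print(f'{entry["key"]}: {entry["text"]} ({entry["comment"]})')
```

Once the comment is attached to the string as data, any tool in the chain can display it next to the text being translated.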
Help them help you
I hope I have clearly demonstrated how understanding the intended meaning of text can be difficult without context, especially for app strings. These best practices will enable you to provide your translators with this vital context from the beginning. Even though a streamlined continuous localization process allows you to release localizations and then quickly update corrections, it’s always better to give your users the best possible experience the first time. Detecting issues, reporting, finding, correcting, and re-releasing updates wastes time and money that is better spent on new features to delight your users and grow your business by going global.