There’s a fact that software developers have grown to accept: to reach your global market penetration goals for a new software package, you have to localize it, whether it is an app for mobile devices or an ERP. Localization should be integrated into the software development process from day one. When localization is overlooked, projects that look straightforward at first glance can turn into nightmares. Here are some guidelines to spare you bad surprises and unnecessary hold-ups.

The do’s

1. Consider your target languages when selecting your character encoding format

UTF-8 is the best compromise between character coverage and storage requirements. It can encode every character in Unicode, which covers the scripts of virtually every language and leaves out only a handful of rare regional ones.

ANSI encoding, by contrast, does not support Arabic characters:

[Figure: Arabic text and ANSI encoding incompatibility]
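
The difference is easy to check for yourself. Here is a minimal sketch, assuming that “ANSI” means Windows-1252, the Western European code page that the label usually refers to. The helper name can_encode and the sample word are ours, chosen for illustration.

```python
# Sample Arabic text (the word for "hello"); any Arabic string behaves the same way.
text = "مرحبا"

def can_encode(s: str, encoding: str) -> bool:
    """Return True if every character of s exists in the given encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

utf8_ok = can_encode(text, "utf-8")    # UTF-8 covers all of Unicode
ansi_ok = can_encode(text, "cp1252")   # assumption: "ANSI" = Windows-1252
utf8_size = len(text.encode("utf-8"))  # each of these Arabic letters takes 2 bytes
```

Running the same check against every target language before you pick an encoding takes minutes and can save a late conversion of all your resource files.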


2. Allow for text string boxes to expand and shrink

Design your software to allow text string reflow, window reorganizing and resizing. Be especially careful about how text is displayed around graphic elements. Separate presentation and content, so that font sizes, line heights, etc. can be easily adapted for translated text. We recommend you pseudo-translate the software strings, using the target characters and factoring in a multiplier that simulates the expansion or contraction of the text strings. For example, Thai letters are taller than English ones, and Italian words tend to be longer than their English equivalents.

The example below illustrates the jump in string length from English to Portuguese, which required moving the field entry boxes. A glitch in the default alignment of the text boxes forced the strings to be only roughly aligned to the right. We ended up correcting the misalignment manually – and with mixed results.
[Figure: text box alignment, English and Portuguese versions]

3. Remember hot keys and keyboard shortcuts also get translated

Translators adapt them to the command names in the target language, and each of the 26 Latin letters can be assigned only once in each language, so the shortcuts cannot simply be carried over from the source.
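
A quick automated check can catch letters that have been assigned twice. The sketch below assumes that hot keys are marked with an ampersand in front of the letter, a common convention but not a universal one. The function name and the French menu are made up for the example, and escaped ampersands are not handled.

```python
from collections import defaultdict

def find_hotkey_clashes(menu_labels: list[str], marker: str = "&") -> dict[str, list[str]]:
    """Map each hot key claimed more than once to the labels that claim it."""
    claimed = defaultdict(list)
    for label in menu_labels:
        pos = label.find(marker)
        if pos != -1 and pos + 1 < len(label):
            claimed[label[pos + 1].lower()].append(label)
    return {key: labels for key, labels in claimed.items() if len(labels) > 1}

# Hypothetical French menu: "Enregistrer" and "Exporter" both claim the letter E.
french_menu = ["&Ouvrir", "&Enregistrer", "&Exporter", "&Quitter"]
clashes = find_hotkey_clashes(french_menu)
# clashes is {"e": ["&Enregistrer", "&Exporter"]}
```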

4. Think locale rather than language

Formatting of times, dates, numbers and names differs according to the locale, not just the language. In the US the month precedes the day; Islamic calendars use different references altogether. Most Continental European countries, Norway and Spain among them, format numbers with a space or a period between thousands and a comma as the decimal separator. Others, such as the US and the UK, use a comma between thousands and a period as the decimal separator. As for user information, prefer the terms “given name/family name” to “first name/last name”, since many East Asian naming conventions place the family name first. Input fields should be able to switch direction and support bi-directional text. Some locales use a 24-hour clock, others a 12-hour clock with a.m. and p.m.
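
To make the point concrete, here is a small sketch that formats the same number and the same date for three locales. The convention table is hand-written for illustration and covers only a fraction of the real rules; production code would normally rely on a library backed by CLDR locale data. The names CONVENTIONS, format_number and format_date are ours.

```python
from datetime import date

# Hypothetical, hand-written convention table, for illustration only.
CONVENTIONS = {
    "en_US": {"group": ",", "decimal": ".", "date": "%m/%d/%Y"},
    "en_GB": {"group": ",", "decimal": ".", "date": "%d/%m/%Y"},
    "de_DE": {"group": ".", "decimal": ",", "date": "%d.%m.%Y"},
}

def format_number(value: float, locale_id: str) -> str:
    """Format with two decimals using the locale's separators."""
    rules = CONVENTIONS[locale_id]
    # Start from the "1,234,567.89" layout, then swap in the locale's separators.
    whole, frac = f"{value:,.2f}".split(".")
    return whole.replace(",", rules["group"]) + rules["decimal"] + frac

def format_date(day: date, locale_id: str) -> str:
    """Format a date in the locale's usual short numeric order."""
    return day.strftime(CONVENTIONS[locale_id]["date"])

amount = 1234567.891
when = date(2024, 3, 5)
us = (format_number(amount, "en_US"), format_date(when, "en_US"))
gb = (format_number(amount, "en_GB"), format_date(when, "en_GB"))
de = (format_number(amount, "de_DE"), format_date(when, "de_DE"))
```

Note that the US and the UK share a language and a number format but not a date order, which is exactly why the locale, not the language, should drive formatting.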

5. Choose English as your source language

You can do without Shakespeare’s language if you translate into a neighboring language (for example Japanese to Chinese or French to German). If you plan on intercontinental localization, it is best to write in English as your source language, because the wide selection of translators keeps the rates down. Alternatively, you can first translate into English, making it your pivot language, and from there translate into your other languages. With this intermediary step you run the risk of reducing translation quality. We translated a point of sale piece of software from French into English as a pivot language, then into twelve other languages. The French version contained two distinct expressions, “escompte” (discount for payment in cash) and “remise” (generic discount). In English and the twelve subsequent languages, the two meanings merged and became “discount”.

6. Test early and often

In most instances translators do not localize software interfaces in context, so they are not aware that their text is too long and displaces the placeholders, generates confusion or breaks the software’s functionality. Only testing will make these bugs stand out and allow corrective measures to be taken ahead of the release date.
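
Some of these checks can run automatically long before a human tester opens the application. The sketch below is one possible starting point, assuming placeholders are written in curly braces and that each UI field has a known character budget. Both are assumptions, and a character count is only a rough proxy for the width on screen.

```python
import re

# Assumption: placeholders look like {name}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")

def check_translation(source: str, translated: str, max_chars: int) -> list[str]:
    """Return the problems found; an empty list means the string passed."""
    problems = []
    if len(translated) > max_chars:
        problems.append(f"too long: {len(translated)} > {max_chars} characters")
    if sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(translated)):
        problems.append("placeholders differ from the source string")
    return problems

ok = check_translation("Delete {count} files", "{count} Dateien löschen", 30)
renamed = check_translation("Delete {count} files", "{anzahl} Dateien löschen", 30)
too_long = check_translation("Delete {count} files", "{count} Dateien löschen", 20)
```

Here ok is empty, renamed reports a placeholder that was translated by mistake, and too_long reports a string that exceeds its budget.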


The don’ts

1. Don’t use the source string as string ID

If you do, any update to the source text breaks the link to the translated strings in each language, and you will have to update each language manually. Instead, remove all the strings that the user can see from the code, store them in string tables saved in UTF-8 (the encoding we talked about earlier) and reference them through stable string IDs. This includes strings in alt text, images, error messages and contextual pop-up help text. You can test that you have extracted all the hard-coded strings by replacing every letter in the string tables with the letter A, for example. Any text that does not read as “AAAAAAAA” once you build the software has not been defined in the string tables.
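
Here is a minimal sketch of both ideas: text looked up by a stable ID, and a test locale in which every letter becomes “A”. The IDs, the table contents and the function names are invented for the example; real projects usually keep the tables in resource files rather than in code.

```python
# Hypothetical string table: stable IDs map to the English source text.
STRINGS_EN = {
    "login.title": "Sign in",
    "login.error.password": "Wrong password",
}

def make_a_locale(table: dict[str, str]) -> dict[str, str]:
    """Build a test locale in which every letter is replaced with 'A'."""
    return {
        string_id: "".join("A" if ch.isalpha() else ch for ch in text)
        for string_id, text in table.items()
    }

def tr(string_id: str, table: dict[str, str]) -> str:
    """Look up user-visible text by ID; the calling code never embeds the text."""
    return table[string_id]

STRINGS_TEST = make_a_locale(STRINGS_EN)
title = tr("login.title", STRINGS_TEST)
# title is "AAAA AA"
```

Because the ID stays the same when the English wording changes, existing translations remain attached to it and can simply be flagged for review.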

2. Don’t use the same string ID to express different things

On one of our software localization projects, we came across the English terms “from” and “to”, used to express location, time and the transfer of data from one person to another. In German, the three meanings are translated von…zu or von…bis. As the translator did not see the source string in context, they were unable to choose between the two.
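
One way to avoid this is to give each meaning its own ID, even when the English text is identical. In the sketch below the IDs are hypothetical, and the pairing of each German rendering with a meaning is our assumption for illustration, using only the two pairs mentioned above.

```python
# Same English text, different IDs: one pair per meaning.
STRINGS_EN = {
    "period.from": "from",    # start of a time range
    "period.to": "to",        # end of a time range
    "transfer.from": "from",  # person handing the data over
    "transfer.to": "to",      # person receiving the data
}

# Assumed pairing of the German renderings with each meaning.
STRINGS_DE = {
    "period.from": "von",
    "period.to": "bis",
    "transfer.from": "von",
    "transfer.to": "zu",
}

def renderings(source_text: str) -> set[str]:
    """Collect every German rendering of one English source text."""
    return {STRINGS_DE[sid] for sid, text in STRINGS_EN.items() if text == source_text}

to_variants = renderings("to")
# to_variants contains both "bis" and "zu": a single shared ID could hold only one.
```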

3. Don’t concatenate strings

Sentences are built differently according to the language, so putting two variables side by side and in a fixed order can result in clumsy sentence structures.
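
A common alternative is to hand translators one complete sentence with named placeholders, so that they can move the words around the variables. The sketch below contrasts the two approaches. The German sentence is an illustrative translation and the function names are ours.

```python
# One full sentence per language; translators place the variables themselves.
TEMPLATES = {
    "en": "Copy {file} to {folder}",
    "de": "{file} nach {folder} kopieren",
}

def copy_message(lang: str, file: str, folder: str) -> str:
    """Build the sentence from a per-language template."""
    return TEMPLATES[lang].format(file=file, folder=folder)

def copy_message_concatenated(verb: str, file: str, preposition: str, folder: str) -> str:
    """Fixed English word order: fragments are glued together as they come."""
    return f"{verb} {file} {preposition} {folder}"

good_de = copy_message("de", "report.pdf", "Archiv")
clumsy_de = copy_message_concatenated("kopieren", "report.pdf", "nach", "Archiv")
# good_de:   "report.pdf nach Archiv kopieren"
# clumsy_de: "kopieren report.pdf nach Archiv" (English word order with German words)
```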

4. Don’t allow gender or numbers inside variables

In the instance below, the variables “pièces”, “configuration” and “zone” are feminine in French and should have been associated with “une” and “la”.

[Figure: gender mistakes in software localization]
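
The mistake is easy to reproduce. In the sketch below, only the nouns and their gender come from the example above; the sentence “Sélectionnez…”, the IDs and the function names are invented for illustration. The naive version drops a noun into a template with a fixed article, while the safer version lets the translator write each sentence in full.

```python
# Naive approach: one template with a fixed masculine article.
NAIVE_FR = "Sélectionnez un {item}"
NOUNS_FR = {"piece": "pièce", "configuration": "configuration", "zone": "zone"}

# Safer approach: one complete, translator-written sentence per item.
FULL_FR = {
    "select.piece": "Sélectionnez une pièce",
    "select.configuration": "Sélectionnez une configuration",
    "select.zone": "Sélectionnez une zone",
}

def naive_message(item: str) -> str:
    """Insert the noun into the template, ignoring its gender."""
    return NAIVE_FR.format(item=NOUNS_FR[item])

def full_message(item: str) -> str:
    """Look up the full sentence, with the article agreed by the translator."""
    return FULL_FR[f"select.{item}"]

wrong = naive_message("piece")  # "Sélectionnez un pièce" (article does not agree)
right = full_message("piece")   # "Sélectionnez une pièce"
```

The same reasoning applies to numbers: singular and plural forms differ from one language to the next, so they are better handled as separate full strings than as a variable dropped into one template.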

5. Don’t insert acronyms

We all love acronyms because they are short and sweet and replace long techie expressions that burden a sentence. Yet in many Asian languages, letter-based acronyms have no direct equivalent, because words are written with ideograms rather than letters.

This list shows you how complex software localization is, and the many traps you can easily fall into if you don’t foresee the consequences of choosing word A instead of word B. Getting advice from your localization partner as you start writing the software code is a way to help you stay on track. Which problems have generated the worst headaches in your software localization projects?