Some companies manage data defined and provided by a single, identified source outside the company, usually a standards organisation. These data, called Reference Data, are generally key to the company’s core activities. Most often, they are shared by many companies, systems, applications, and/or processes.

Typical examples of reference data are currencies and currency codes, defined by ISO and used for calculating prices, and airport and city names, defined by IATA or ICAO and widely used in the travel industry.

Users often claim these data are “translated”, which is a simple process for currency data: pound becomes livre, font, pundet, pond, punta, جنيه, 파운드… By contrast, only a few cities, mostly historically famous or geographically important places, have their own “proper” names in other languages: Αθήνα is Athens, Athinë, Ateena, Atina, Atenas, Athen, Atene, 아테네, ათენი, 雅典, أثينا…

For tens of thousands of cities, no “translations” exist. For instance, city names such as Tasiilaq, Thohoyandou, Szczuczyn, and Niquelândia are kept unchanged in all languages, at least in those written in the Latin script. If, however, the data must be displayed in a non-Latin script, they have to be either transliterated or transcribed.

Both methods for converting text from one writing system to another are valid; the choice depends on the language pair. Transliteration is an orthographically accurate mapping of characters from one alphabet to another; properly done, it allows the original spelling to be recovered. It makes sense between languages whose alphabets are of roughly similar size and concept. It is not feasible, however, for a language like Chinese, whose writing system consists of thousands of logographic characters. For such languages, data must be transcribed. Transcription is a phonetic mapping: it renders the sounds of the source language as closely as possible in the script of the target language. A well-known example is Pinyin, the official system for transcribing Chinese into Latin script.

Tasiilaq would be transcribed into Chinese as 塔西雷克 (Tǎ Xī Léi Kè in Pinyin) and transliterated into Russian as Тасиилак.
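
The reversibility that sets transliteration apart can be illustrated with a minimal Python sketch. The mapping table below is a toy assumption that covers only the letters of Tasiilaq; it is not taken from any published transliteration standard, and the names `LATIN_TO_CYRILLIC`, `CYRILLIC_TO_LATIN`, and `transliterate` are hypothetical. Because the table is one-to-one, inverting it recovers the original spelling.

```python
# Toy Latin-to-Cyrillic table covering only the letters in "Tasiilaq".
# Hypothetical mapping for illustration; not taken from any standard.
LATIN_TO_CYRILLIC = {
    "T": "Т", "a": "а", "s": "с", "i": "и", "l": "л", "q": "к",
}

# The table is one-to-one, so inverting it yields the reverse mapping.
CYRILLIC_TO_LATIN = {cyr: lat for lat, cyr in LATIN_TO_CYRILLIC.items()}


def transliterate(text: str, table: dict[str, str]) -> str:
    """Map each character through the table; reject unmapped characters."""
    try:
        return "".join(table[ch] for ch in text)
    except KeyError as exc:
        raise ValueError(f"no mapping for character {exc.args[0]!r}") from None


russian = transliterate("Tasiilaq", LATIN_TO_CYRILLIC)
restored = transliterate(russian, CYRILLIC_TO_LATIN)
print(russian)   # Тасиилак
print(restored)  # Tasiilaq
```

A realistic table would have to resolve collisions, for example two Latin letters competing for the same Cyrillic letter, which is exactly where a naive mapping stops being reversible. Transcription cannot be expressed as a character table of this kind, since it depends on pronunciation rather than spelling.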