[Post 5 in the DITA Loc Wire series]

DITA is a framework for content re-use across multiple information channels. This framework relies on separating content from format and on semantic tagging.

Many of the options available to a technical writer require consistency to ensure an optimal user experience:

  • Consistent UI terms in the software displays and the documentation
  • Consistent index terms to avoid conflicting entries in the automatically generated index table
  • Exact document names or references when referencing external documents
  • Consistent wording between a link (internal or external) and the content displayed when the link is followed
  • Consistent wording between the glossary and the terms highlighted in the content when a glossary is included
  • Consistent wording between the Bill of Materials and the tooling, part numbers, and reagents. This consistency can make it possible to link directly from the documentation to the order form.

Consistency is a subtle mix of ingredients

To build consistency in the source content (English in most instances), technical writers can rely on their product expertise. When writers work in teams, style guides and glossaries are invaluable for ensuring consistency between team members. The DITA architecture can also automate consistency through content inclusion and keys. However, this might add unnecessary complexity to the content and cause localization issues (see post #2).

How do you ensure consistency in translation?

When DITA content is translated, typically into 10 to 20 languages, the localization process provides tools to prevent consistency issues. The partner’s product expertise is not sufficient, even when the translation is carried out internally. You need to translate your style guide and glossary (called a term base by translation providers) into each language.

A translator refers to the term base when a tagged term comes up. If the term base is incomplete and the term cannot be found, the translator has two options:

  • Send a query to the project manager, who forwards it to the client’s expert for an answer. This query process can become extremely heavy, especially if there are 20 languages.
  • Make an educated guess, relying on concordance searches in the translation memory. The consequence is a loss of productivity or, worse, inconsistencies.

We recommend preprocessing the keywords

We recommend extracting from the DITA corpus all the keywords enclosed in specific tags such as uicontrol, wintitle, cite, xref, userinput, systemoutput, keyword, glossterm, or indexterm. Then check that every term is actually included in the term base for each language.
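As a rough illustration of this extraction and check, here is a minimal Python sketch. It assumes each topic can be parsed as standalone XML and that the term base for a language is available as a simple source-to-target dictionary. The names extract_terms, missing_terms, SAMPLE_TOPIC, and FRENCH_TERM_BASE are hypothetical and exist only for this example; a real term base would more likely come from a terminology tool export.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tags listed in the post; adjust to the elements your content actually uses.
TERM_TAGS = {"uicontrol", "wintitle", "cite", "xref", "userinput",
             "systemoutput", "keyword", "glossterm", "indexterm"}


def extract_terms(dita_xml):
    """Return {term: set of tag names} for every tagged term in one topic."""
    root = ET.fromstring(dita_xml)
    terms = defaultdict(set)
    for elem in root.iter():
        if elem.tag in TERM_TAGS:
            # Join nested text and collapse line breaks and repeated spaces.
            text = " ".join("".join(elem.itertext()).split())
            if text:  # skip empty elements, e.g. an xref with only an href
                terms[text].add(elem.tag)
    return terms


def missing_terms(terms, term_base):
    """List extracted terms that have no entry in one language's term base."""
    return sorted(t for t in terms if t not in term_base)


# Hypothetical sample topic and term base, for illustration only.
SAMPLE_TOPIC = """<task id="save">
  <title>Saving a report</title>
  <taskbody>
    <steps>
      <step><cmd>Click <uicontrol>Save As</uicontrol> in the
        <wintitle>Report Editor</wintitle> window.</cmd></step>
      <step><cmd>Type <userinput>report.pdf</userinput>.</cmd></step>
    </steps>
  </taskbody>
</task>"""

FRENCH_TERM_BASE = {"Save As": "Enregistrer sous"}

terms = extract_terms(SAMPLE_TOPIC)
print(missing_terms(terms, FRENCH_TERM_BASE))
```

In this sample, "Report Editor" and "report.pdf" are reported as missing from the French term base. Note that the sketch flattens nested elements (for example an indexterm inside another indexterm) into a single string, which a production script would probably want to handle separately.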

With proper scripts, this process can be fully automated in the preparation phase and run only once for every language. It might also help spot inconsistencies or mistakes in the English source. Once this step is complete, the translation process should run smoothly, with high consistency and productivity.
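One way such a script might flag inconsistencies in the English source is to group the extracted terms that differ only in capitalization or spacing. The sketch below is an assumption about what "inconsistency" means here, not a complete check, and find_variants and EXTRACTED are hypothetical names used for illustration.

```python
from collections import defaultdict


def find_variants(terms):
    """Group terms that differ only by case or whitespace.

    Returns {normalized form: sorted list of variants} for every group
    that contains more than one distinct spelling.
    """
    groups = defaultdict(set)
    for term in terms:
        # Normalize: ignore case and collapse runs of whitespace.
        key = " ".join(term.casefold().split())
        groups[key].add(term)
    return {key: sorted(variants)
            for key, variants in groups.items()
            if len(variants) > 1}


# Hypothetical list of terms extracted from uicontrol elements.
EXTRACTED = ["Save As", "Save as", "Report Editor", "Print", "save  as"]

print(find_variants(EXTRACTED))
```

Here the three spellings of "Save As" end up in one group, which a writer can then review and harmonize before the content goes to translation.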

Another option is to centralize some of these keywords in reference tables and import them into the rest of the content using keyref, conref, or conkeyref, but this might have significant linguistic implications (see post #2).