January 2017
By Wojciech Froelich

Image: © Ivaylo Sarayski/123rf.com

Wojciech Froelich has 15 years' experience in localization engineering. He is CTO at Argos Multilingual where he leads a team of experienced engineers who are responsible for building customer-oriented localization workflows and providing internationalization consultancy and software engineering suport to Argos partners. He focuses on integrating authoring, automated translation management and multilingual publishing systems.


info[at]argosmultilingual.com
www.argosmultilingual.com


 


 

Localizability and world-readiness for software

Software errors found after the product has been released can cost four to five times as much to fix as the ones uncovered during the design stage. And, for localized products, these costs can be multiplied by the number of target languages. Incorporating localization steps early can reduce costs and save precious nerves.

When a software development team is in the product design phase, it is easy for localization to be overlooked. Development teams are busy gathering user requirements, developing components and testing usability in the source language (most often English) – until the moment when the product manager says:

"OK, this is great! Let’s sell it in 15 countries."

If localizability hasn’t been built into the product, developers can be hurled into a costly and time-consuming spiral of retrofitting global needs onto a product, delaying global release cycles and creating inelegant, provisional workarounds that add unnecessary complication to both native and international release processes. What seemed like a great idea – to amortize development costs more quickly through global expansion – can rapidly become the source of sleepless nights for both the product manager and the development team.

The ultimate goal of software development for the global marketplace should be to create a "world-ready" application. This process is often referred to as internationalization, and ensures that software can be run anywhere in any language. This means the application should be able to support any locale, any regional time, date or numbering formats, as well as ensuring that the application is ready to be translated into any language, even those using different scripts and alphabets or bi-directional languages such as Hebrew or Arabic.

The key to localizability is the design of software and resources that enable a software localization process that does not require re-engineering or modifying code after translation processes have been completed.

The first step on the road to a "world-ready" application is achieving the clear separation of User Interface (UI) resources from functionality-related resources. The ideal scenario is to achieve this separation using file formats that suit localization teams – enabling both the use of professional translation tools and potential automation, which becomes a key element when developing and localizing in agile environments.

The second step is to be able to measure the success of any localizability efforts. The easiest way to confirm that a product can be easily localized is by running localizability testing tasks.

Function and process

There are typically three interdependent functional teams involved in launching a global product: developers, translators and testers.

As the process flow in Figure 1 illustrates, usability testing takes place prior to the launch of software in its original language, but also after translation. Localization testing can result in two types of issues:

  • Translation issues can be fixed by getting the linguist to amend the translation. This type of issue usually occurs either because the original translation does not reflect a string’s real meaning in the context of the running application or a string needs to be shortened to match the space available within the User Interface.
  • Functional issues are issues that need to be routed back to developers that cannot be fixed by making a linguistic change. These can range from bugs introduced due to the translated language to string length issues, which cannot be fixed by reducing the length of a translation or functional issues not apparent in the English build.

Figure 1: A typical software release process
Source: Argos Multilingual

 

Context is king

It might seem self-evident, but the key to a successfully localized software product is to enable linguists to do the best possible job.

Professional linguists are usually highly qualified: In addition to a degree in translation, they often have many years of professional experience before they specialize in a particular area. But even the best linguists need to have a solid understanding of the original product if they are to produce a great translation.

Reducing the number of functionality issues relating to world-readiness is one of the factors that will allow linguists to get the translation right the first time.

When it comes to giving linguists what they need to do the job correctly, context is the single biggest factor. In software localization, context is king. Provide as much context information as possible, and your product will reap the benefits.

The perfect scenario is a full visualization of the UI during translation, but other elements can also be very helpful:

  • Meaningful string identifiers can provide additional context information.
  • Comments from developers can be a great source of information, especially for cryptic UI messages.
  • Length limitation information. Note that a “number of characters” limitation will work only with fixed-width fonts. When using proportional fonts, the linguist will need to know the actual width of a string in pixels compared with the length limitation.

In cases where localizability has not been built into the development process and where the product manager decides that the product should be sold globally, a developer is often quickly reassigned to produce "translatable" resources. In practice, these developers are very often tempted by easy-looking formats like Excel files, where linguists are expected to insert translations into their language’s column parallel to the English string. Once translation is completed, a developer then needs to copy and paste the translations back into the code. It is worth mentioning that it is rare that these kinds of hurried copy-and-paste operations in an unfamiliar language are done right the first time around!

Figure 2 is an example of such a translatable file in Excel format.

Figure 2: Example of a UI translation file in Excel format
Source: Argos Multilingual

 

As you can see, the developer has put a great deal of effort into creating the file and has included useful information such as where the string is found and the maximum string lengths. No doubt the developer is very proud of the work and believes that the file will be a great help for translation teams.

Although the file might look cumbersome, the developer has actually done a decent job in this example; it is significantly better than many localization files received even ten years ago!

In reality, though, these formats are not easy to process in a translation workflow, and it is worth getting developers and localization engineers together in order to discuss available options. In most situations, there are better ways to handle the UI files. These days, linguistic technologies are far more advanced and a little forethought can result in a much smoother localization process.

Where should you start? Usually, the best place to begin is the software development framework being used.

Most modern software development frameworks provide tools to extract UI content from code and guidelines on how to create code that can be easily processed for localization. These tools produce files in formats supported by translation tools like .resx, .properties, .po and .ts files.

Processing UI content with translation tools gives linguists access to many advanced features like translation memory, terminology databases and automated quality checks that are right at the translator’s fingertips as part of the translation environment. These features help speed up the production of high-quality translations.

Some frameworks even enable linguists to preview UI while translating. This is usually the best possible scenario – translators see the translation immediately in the right context, they see what type of element is being translated (button, menu item, text field label), and they can see the potential space limitations.

Unified formats for UI translation are a key element to automation of hand-off/hand-back processes in agile workflows. Without automation, agile localization processes can’t integrate with agile development processes, and there is no automation if formats are not clearly defined.

Figure 3 shows a preview generated from an .resx Windows Form by a popular translation tool. As you can see, there is plenty of useful information available for linguists, including the string ID, development comments as well as a preview of the dialog, which will help linguists to understand how a dialog is used, and whether there are any string length limitations.

Figure 3: A Windows Form preview generated from .resx resources
Source: Argos Multilingual

 

Even custom XML files can be previewed. Because XML files can be easily processed for translation, additional information such as comments about usage or context can also be included for linguists, giving them the best possible chance of getting translations right the first time.

A proactive localizability development model

The problem with incorporating localization late in the development process is that it complicates the process of resolving language-related issues discovered during or after translation, and during localization testing. These issues will probably require some development rework and are likely to happen at a very critical moment in the timeline – right before the release. A study by the Systems Science Institute at IBM found that the "cost to fix an error found after product release was four to five times as much as one uncovered during design."

For localized products, these costs have the potential to be multiplied by the number of target languages.

If localizability testing isn’t carried out before English approval, each testing team (for the respective target languages) will spend time discovering and reporting the same issues, and will then each have to run regression testing on the same bugs. This can be avoided if localizability testing is brought forward in the product development cycle.

If we modify the earlier process a little – by running localizability testing in sync with usability testing – we can avoid these issues altogether. Figure 4 shows a revised process flow with the inclusion of pseudo-translation testing.

Figure 4: A software release process adapted for localization
Source: Argos Multilingual

 

This means developers will exchange localizable files with the translation team even before source content is approved.

Introducing pseudo-translation testing helps to identify language-related issues early in the process, so there is still enough time to go back to developers and fix any language-related issues.

Once the source content is approved, it can be translated and tested linguistically, knowing well that the number of issues requiring developer intervention has been minimized because they were already discovered and fixed much earlier in the process.

Pseudo-localization

So what is pseudo-localization? It is a very rapid and cost-effective method of mimicking the effects of the translation process without the costs and scheduling impact of a real translation.

Pseudo-localization simulates the translation of all resource strings, and usually includes:

  • Replacing some English characters with similar non-English characters.
  • Adding extra characters to simulate text expansion.
  • Adding prefixes and suffixes to mark the boundaries of a string.
  • Using Unicode characters for all resources.

All these changes can be automated, so all UI resources can be modified, using similar patterns that help to identify issues quickly in newly built resources.

Pseudo-translated resources can be used to rebuild an application and test it. It should function in the same way as the original version.

Thanks to the patterns used while pseudo-translating resources, it is easy to detect the following issues:

  • Strings that were hard-coded and were not moved to the localizable resources.
  • Strings that were built from other strings by concatenation or replacing some parts of the string – these strings will often fail to work in languages different from the source language.
  • Any Unicode-incompatible functions that process or display text.
  • Any other display issues like incompatible fonts, truncated strings or unexpected space limitations.

Code reviews and lessons learned

Just as complex projects should be followed by a post-project review, it is worth closing the development loop by creating and maintaining code development guidelines that include localization-related guidelines. These guidelines should reflect results of code audits and localization process issues that were fixed in the past. For projects with large volumes of code, there are third-party tools that can automate the code review process. What’s more, localization providers will normally be happy to participate in this process too, as it will help them to perform great work the first time around in future projects.

In a nutshell, world-readiness is perfectly compatible with the development process as long as you take steps to bring your localization partners on board early, perform pseudo-translations to test localizability before translations begin, and consider the best format to allow linguists to perform their job.

All this means that everyone can enjoy a peaceful night’s sleep when your product manager asks to sell in multiple locales. You will be able to answer with confidence: "Sure. No problem, let’s do it!"