March 2019
Text by Krissy Welle

Language technology to the rescue

Kató Speak is a new translation memory tool that was created to enable communication in a humanitarian crisis. But its potential could reach well into the commercial market.

In some parts of the world, a voice-activated home device can play your favorite song or teach you how to bake a cake. Yet, in those parts of the world that are experiencing humanitarian crisis, people often aren't able to read instructions to attend to their basic needs, even if they are written in their own language. And they certainly can't ask Alexa to help.

In these situations, speech technology is more than a luxury. Quite often, people affected by a humanitarian crisis are less literate. Women in particular are less likely to be able to read. In such scenarios, having information in an audio format is a need, not a want.

Of course, providing information in an audio format is not enough to ensure understanding. Language also needs to be considered. Yet, to complicate matters further, people living through a humanitarian crisis often do not speak a language blessed with extensive resources. The languages they speak may not have many trained interpreters, any translation memory, or even a universally accepted script. Speech-to-text or text-to-speech solutions don't exist; in many cases there isn't enough data to build such technology.

This is the situation that my organization, Translators without Borders (TWB), deals with on a daily basis. Due to this predicament of human need and lack of technology, we have developed the first voice translation memory system, Kató Speak.

Kató Speak is a critical development that can help humanitarian organizations communicate with people living through unimaginable disasters. However, the benefits of the budding technology extend beyond the humanitarian sphere, pushing the boundaries of language technology.

Rather than trying to fit commercially developed solutions to the needs of people in crisis, the development of this new technology has reversed the process: The basic human needs of those who speak marginalized languages are the impetus behind Kató Speak.


Developed for a crisis

Since August 2017, more than 700,000 Rohingya people have crossed the border from Myanmar to Bangladesh. They are fleeing persecution and violence in a country where they are subjected to discrimination as an ethnic minority.

International humanitarian organizations mobilized to help the Bangladesh government respond to the needs of the new arrivals. TWB arrived on the scene in 2017 to address the language issue and quickly learned that this was one of the most complicated linguistic situations we had experienced.

Five languages were spoken in the response: Rohingya, Bangla, Burmese, Chittagonian, and English. However, Rohingya was the only language that was accepted universally and that all refugees understood. Yet, the resources for the Rohingya language were limited: There weren't many trained interpreters or translators, many Rohingya were illiterate and, to further complicate matters, there was no agreed-upon, standardized script for the language. Audio messaging was the best solution, yet an efficient, scalable translation solution did not exist.

As TWB began working with humanitarian organizations to support the refugees, we recognized the need to create content in an audio format and to streamline and optimize the process for doing so. Working with on-site responders as well as volunteer translators, we began recording spoken translations to supplement written translations. In the process, we generated data to fuel a new and unique kind of translation memory system.

The resulting technology, Kató Speak, is the first voice translation memory system of its kind, creating an easy way to generate audio communication in underserved languages. It integrates with TWB's own Kató 2.0 translation management environment, and allows responders to record and reuse spoken information that has already been translated into text.

The recording process is straightforward: contributors submit recordings through a simple web link. Anonymous data is collected, including age, gender, and, if available, the geo-located accent of the speaker. This data helps match the speaker to the audience. On some subjects, such as reproductive health, women may be more receptive to a message spoken in a female voice. And, as the recorded dataset grows, the search function can retrieve matching recordings based on text and metadata. This streamlines the process for subsequent translations, helping humanitarians to respond more quickly and with greater confidence.
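To make the idea of retrieval by text and metadata concrete, here is a minimal sketch in Python. It is not Kató Speak's actual implementation: the `Recording` fields, the `find_recordings` function, the file names, and the exact-match rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recording:
    """One spoken translation plus anonymous speaker metadata (hypothetical schema)."""
    source_text: str               # the written sentence that was translated
    language: str                  # language of the spoken translation
    audio_file: str                # illustrative file name, not a real path
    gender: Optional[str] = None   # anonymous speaker metadata, if provided
    age: Optional[int] = None
    accent: Optional[str] = None


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences still match."""
    return " ".join(text.lower().split())


def find_recordings(
    memory: List[Recording],
    source_text: str,
    language: str,
    gender: Optional[str] = None,
) -> List[Recording]:
    """Return recordings of the same source text in the requested language,
    optionally restricted to speakers of a given gender."""
    wanted = normalize(source_text)
    return [
        r
        for r in memory
        if normalize(r.source_text) == wanted
        and r.language == language
        and (gender is None or r.gender == gender)
    ]


# Illustrative data only; these entries are invented for the example.
memory = [
    Recording("Clean water is available here.", "Rohingya", "water_01.mp3", gender="female"),
    Recording("Clean water is available here.", "Rohingya", "water_02.mp3", gender="male"),
    Recording("The clinic opens at nine.", "Rohingya", "clinic_01.mp3", gender="female"),
]

matches = find_recordings(memory, "clean water  is available here.", "Rohingya", gender="female")
```

In this sketch, a responder looking for a female voice for a given message would get back only the first recording. A real system would likely need fuzzier text matching than the exact comparison assumed here.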

The resulting speech translations can be downloaded as simple MP3 files that contain all relevant data, including the source text. In a humanitarian setting, aid workers can play these recorded translations easily over loudspeakers, radios, or on their phones.

These standardized recordings are especially important in a humanitarian context. Refugees often encounter aid workers associated with many different organizations. Using a translation memory system that can provide consistent voice translations in native languages such as Rohingya ensures that responders are more confident in the translations and information they are providing. It also makes sure that indigenous people aren't hearing conflicting translations that confuse the message.

Ultimately, Kató Speak addresses a critical human need. Yet the impact of this new technology can extend beyond the humanitarian world, helping other language technologists advance their text-to-speech and speech-to-text capabilities, especially as more data is collected in marginalized languages.


Implications beyond the humanitarian world

As language technology continues to evolve and adapt, every development can contribute to the whole industry. In fact, Kató Speak was recently recognized as top "invader" technology at the TAUS Game Changer Innovation Awards, demonstrating the industry's recognition of the approach.

We will continue to scale this technology, working with our communities of translators to increase the data collection in more languages. We certainly welcome your help as we do so. Kató Speak is one of the first outcomes of TWB's Gamayun language equality initiative, which develops language data and technology for marginalized languages. The data generated through Gamayun will be available for others to use. This will help technologists build tools and reduce the gap in knowledge between those speaking more technologically developed languages and those with restricted access to information because of the language they speak.

The first voice translation memory system was created as a result of the challenges posed by a humanitarian crisis. Technology does not always develop solely as a result of inquisitive minds and massive funding. It is developed to solve human problems, and the solutions have far-reaching implications, not only for technology and communication but also for human lives.


Image: A TWB trainer conducts comprehension research in northeast Nigeria.
© Eric DeLuca, TWB