June 2019
Text by Allison Ferch

Allison Ferch is GALA's Executive Director. For more than nine years, she has served GALA in several roles and provided leadership for many of the association’s programs and activities. As Executive Director, she leads internal operations at GALA and is responsible for the growth of the association, its financial stability, and for the execution and expansion of programs and services to members.


www.linkedin.com/in/allisonferch
www.gala-global.org


 

The TAPICC Steering Committee includes David Filip (ADAPT Centre at Trinity College Dublin), Serge Gladkoff (Logrus Global) and Klaus Fleischmann (Kaleidoscope GmbH). Working group leaders include Serge Gladkoff (Logrus Global), Jörgen Danielsen (Eule GmbH), Ján Husarčík (Akorbi), Alan Melby (Brigham Young University), Andreas Galambos (Transmission Uebersetzungen GmbH), Jim Compton (RWS Moravia), and Ian Barrow (Conversis).

TAPICC and the language industry’s quest for interoperability

A collaborative, pre-standardization initiative strives to build consensus on use cases, metadata, and best practices for translation APIs.

For professionals handling multilingual content, the translation round-trip – or the transfer of data between various platforms and systems – is an important factor that can have a significant impact on project cost and time to completion. If you stop to think about the complexity of modern-day content management and the numerous types of systems involved, you can see how desirable interoperability is, and why it’s worthwhile to strive for a standardized approach to the translation round-trip.  

 

The problem

This common challenge has traditionally been solved by using API integration, or system-to-system communication that enables automation of repetitive tasks such as extraction and file transfer. The problem, however, is that there is an overabundance of customized APIs between a staggering number of systems and platforms. Customized API solutions require a commitment to a particular vendor or specific tech stack, and they are not easily adapted if either of those changes. Furthermore, they require a great deal of resources in terms of IT, development, and of course, time and budget. Proprietary APIs hinder collaboration between industry players and make change difficult, which can stifle progress and innovation. It’s easy to see how the customized approach makes companies feel that they are stuck or are reinventing the wheel each time they forge a new connection. And the impact isn’t felt by just one party: clients feel it, translation vendors feel it, and tool developers feel it. It’s a ubiquitous and costly problem for those in the language industry.

Additionally, there is an untold impact on global customers. It’s well understood that translating and localizing content provides a better user experience and can lead to customer acquisition, loyalty, and retention. But the cost and complexity of the translation and localization process, especially around system integration and automation, can mean companies forgo those endeavors to save money and, in doing so, leave important content untranslated.

What can be done to improve matters? How can we eliminate loss of operational freedom, wasted resources, and untranslated content streams? A magic wand would give the industry a universal standard, similar to the Bluetooth standard for the transfer of data between wireless devices. But without a magic wand, the challenge must be tackled the old-fashioned way with hard work, reasoned debate between competing interests, and consensus-building among stakeholders.

The Translation API Cases and Classes (TAPICC) initiative was launched in 2017 to address this interoperability challenge. At the outset, it was established as a community-driven, open-source initiative to advance API standards for multilingual content delivery. It was designed as a pre-standardization project, its purpose to provide a metadata and API framework on which future integration, automation, and interoperability efforts can be based.

TAPICC aims to build a collection of use cases and best practices for all stakeholders and provide quickly implementable classes (code) for deployment. All these deliverables are meant to be donated to bona fide standards organizations for further development. Happily, TAPICC has been making excellent progress these past 18 months through its working groups, which are led by passionate volunteers committed to the common good and the advancement of the entire sector through open standards.

 

The TAPICC charter, governance, and deliverables

The TAPICC initiative is managed by the Globalization and Localization Association (GALA), a global industry association for the corporate language sector. GALA’s independent, nonprofit status and its global reach make it a trustworthy steward of industry development projects like TAPICC. To the extent possible, we have approached TAPICC in ways typical of open standards projects, including the development of a project charter, steering committee, and timelines and milestones. While GALA is not a standards organization (and does not intend to become one), we felt it was important that the work of the TAPICC initiative meet the rigorous expectations of standards bodies.

The stated goals of the initiative are to: 

 

  • Identify use cases for API standardization in the localization industry; classify and categorize these use cases and match them with any existing projects to provide a starting point for further standardization.
  • Harmonize the business metadata of existing standardization projects, extend them with community-driven input, and publish standardized sets of business metadata for future use.
  • Create API classes that can be implemented by clients, LSPs, and technology vendors alike.

TAPICC was set up as a collaborative community project on the GALA platform with the understanding that consensus-building would be critical. It openly encourages participation by all community stakeholders including language service providers, translation and localization buyers, language technology vendors, and researchers. GALA membership is not a requirement for participation, and anyone is welcome to join the working groups. All contributions are governed by well-regarded open licenses (BSD 3-Clause and CC-BY 2.0), and works-in-progress and deliverables are publicly available on platforms like Google Docs and GitHub.

The initiative is guided by an experienced steering committee of forward-looking, proactive practitioners setting project priorities and defining deliverables. GALA serves as the platform for collaboration, project promotion, and dissemination of information. TAPICC has a few dozen committed contributors and the support of more than 275 volunteers who follow its progress with interest.

The project follows four main standardization tracks:

  1. Supply chain automation: Enable the exchange of business metadata, localizable content, and automation-relevant functionality between content repositories and localization environments
  2. Transfer of localizable content on a segment or unit level: Enable the handing over of segments and units within the localization process, for instance between different localization tools
  3. Enrichment of localizable content: Enable the markup of localizable content with linguistic information such as TM matches, MT output, terminology markup, and "good enough" layout markup
  4. In-layout translation: Enable an abstract layout representation in localization content to allow the translation and review process to happen in-layout, independent of the content source

To date, the work of the volunteers has focused on Track 1, supply chain automation. Track 2, transfer of localizable content on a segment or unit level, is about to commence.

Track 1 progress and deliverables

Track 1 was divided into four working groups at the outset: business metadata, payload specification, XLIFF extraction, and API specification. There was significant interaction and dependency between the groups, plus no small amount of disagreement and debate, as they sought to develop consensus.

Working Group 1 (WG1) took on the challenge of defining task types that are common in translation and localization projects. They developed a multi-tier task type typology and generalized eleven tasks that comprise any and all content-related tasks in translation workflows. Those main tasks were further broken down into subtasks that can be used when more details are known about the project in question. The typology developed by WG1 was used in Working Group 4 (WG4) to define the API, and it was generally agreed that high-level tasks will have fewer details and be less defined than the granular, concrete subtasks. With this metadata model, only a limited number of "task element types" is needed to build just about any workflow with any level of detail.

Working Group 2 (WG2) was mandated to specify what kind of payloads will be exchanged via the API. This group produced critical consensus on the payload being driven by the task type defined within WG1. Importantly, they concluded that all bilingual linguistic tasks such as translation and review must be based on exchanging the open, transparent bitext format XLIFF 2. Further, they determined that standalone terminology should be exchanged as TBX Basic (compliant with ISO 30042:2019, also known as TBX Version 3 or the 2nd ISO edition). Terminology accompanying translation and revision tasks must be either XLIFF Glossary Module-based or TBX Basic-based.
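The WG2 conclusions can be read as a small lookup from task type to permitted payload formats. The following Python sketch is only an illustration of that reading, not part of any TAPICC deliverable: the task labels ("translation", "review", "revision", "terminology") and the names PayloadFormat, PayloadRule, and payload_rule are hypothetical, and the actual WG1 typology defines eleven tasks that are not enumerated here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class PayloadFormat(Enum):
    XLIFF_2 = "XLIFF 2"
    TBX_BASIC = "TBX Basic"
    XLIFF_GLOSSARY_MODULE = "XLIFF Glossary Module"


@dataclass(frozen=True)
class PayloadRule:
    """Main payload format plus formats allowed for accompanying terminology."""
    main: PayloadFormat
    accompanying_terminology: Tuple[PayloadFormat, ...]


# Hypothetical task labels; the real WG1 typology is not listed in this article.
BILINGUAL_TASKS = {"translation", "review", "revision"}
STANDALONE_TERMINOLOGY_TASK = "terminology"


def payload_rule(task_type: str) -> PayloadRule:
    """Return the payload rule implied by the WG2 consensus for a task type."""
    if task_type in BILINGUAL_TASKS:
        # Bilingual tasks exchange XLIFF 2; terminology may ride along
        # either in the XLIFF Glossary Module or as TBX Basic.
        return PayloadRule(
            main=PayloadFormat.XLIFF_2,
            accompanying_terminology=(
                PayloadFormat.XLIFF_GLOSSARY_MODULE,
                PayloadFormat.TBX_BASIC,
            ),
        )
    if task_type == STANDALONE_TERMINOLOGY_TASK:
        # Standalone terminology is exchanged as TBX Basic.
        return PayloadRule(main=PayloadFormat.TBX_BASIC, accompanying_terminology=())
    raise ValueError(f"No payload rule sketched for task type: {task_type!r}")


if __name__ == "__main__":
    print(payload_rule("translation").main.value)
    print(payload_rule("terminology").main.value)
```

Keeping the rule in one function mirrors the idea that the task type drives the payload; an implementation would take its task types from the published metadata model rather than from the labels assumed here.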

Working Group 3 (WG3) explored the best practices for producing XLIFF, i.e. the payload type critical for core TAPICC use cases. This group produced the first formal TAPICC deliverable: TAPICC XLIFF 2 Extraction and Merging Best Practice. This deliverable was contributed by GALA to OASIS XLIFF TC, which pledged to re-release it as a Technical Committee Note and maintain it. That contribution marked a key accomplishment of the TAPICC project.

The mission of Working Group 4 (WG4) was to "tie it all together" – the edicts, recommendations, etc. from the other working groups – into a RESTful API specification that supports standardized system-to-system transactions. The deliverables include a Swagger file and a reference guide to support developers who endeavor to create so-called TAPICC-compliant implementations.

The body of work produced in Track 1 represents hundreds of hours of collaboration among volunteers spanning the globe. Working through different points of view across a diversity of use cases is no small endeavor and we take off our hats to the people who have contributed so much to the project thus far.

 

What’s next?

At GALA’s annual conference in Munich (March 24-27, 2019), there were two pre-conference TAPICC workshops. One focused on the output of WG4, the TAPICC RESTful API specification, and addressed how to create a TAPICC-compliant "host" system, or how to make your current system work with a TAPICC-compliant host. The second workshop focused on the upcoming work of Track 2 and introduced the concept of real-time exchange. It explored the differences between real-time and asynchronous exchange for supply-chain automation and examined related business cases. Participants also discussed JLIFF as the standard for real-time exchange, a natural continuation of the Track 1 WG2 consensus to use XLIFF 2 as the bitext format. The rest of the workshop was dedicated to discussing a fork of the WG4 API, focusing on the proposal for real-time exchange. It marked the official launch of Track 2.

GALA is currently recruiting volunteers to participate in Track 2, and we encourage anyone with a stake in this aspect of multilingual communication to get involved. Additionally, GALA is preparing a few TAPICC presentations for upcoming conferences in 2019, to add to the more than a dozen presentations it has made on TAPICC these past 18 months in an effort to evangelize the project and garner support and engagement.

 

Challenges and obstacles

We’re delighted with the progress made by the initiative thus far, but it is not without its challenges and obstacles. As with any volunteer-led project, delays are inevitable owing to the limited bandwidth of volunteers, who have regular jobs and obligations to attend to. Planning meetings with busy professionals across time zones is always going to be a challenge, but it is critical to the progress of the working groups. (How far they have managed to come already is a credit to their commitment and passion for solving this challenge.) Furthermore, it’s important for more than one person to take the helm of working groups to allow for the inevitable flux of focus and availability of each volunteer leader. However, finding volunteers willing to commit at this level is not easy.

With many initiatives, especially those related to standards, there can be significant "creep" in the mission or focus. The scope can expand or diverge based on the interests and priorities of those involved. With the TAPICC initiative, we’ve been very clear that we’re not trying to boil the ocean and have stayed true to the well-defined goals set at the outset, in spite of many lively and passionate debates. But vigilance is important, and leadership and volunteers must remain mindful of the scope and aims of the project, even in the face of competing demands.

Another significant challenge for TAPICC is stakeholder representation. To date, the vast majority of the initiative’s volunteers come from language service provider companies and the occasional localization buyer. The language technology developers have mostly camped on the sidelines, watching to see how the project develops. We’d like to see more balanced participation from stakeholder groups, which can only lead to better insight, better debate, and ultimately better buy-in.

 

An open invitation

As a well-balanced representation of stakeholders is critical to the project’s success, we invite all types of language industry professionals to engage in TAPICC. Content publishers, system integrators, language service providers, localization technology developers, multilingual content and process architects… all are needed to advance the initiative. Your experience matters, your opinion matters, and your input is helpful, no matter how small. To get involved, please visit www.gala-global.org/tapicc. Join a discussion group, attend meetings, and advocate internally for open standards instead of closed systems. Your opinion on the Track 1 deliverables is also important, and you’re invited to share feedback. Start your TAPICC journey on the GALA website, and feel free to reach out to any GALA staff member or any TAPICC volunteer leader for more information.