A Brief Introduction to Translation Technology Standards, Tools & Best Practices

Published in: Translation technology
by Adrien Mathot, Translation Technologist @cApStAn

First of all, what is a standard? According to the European Committee for Electrotechnical Standardization (CENELEC), standards “provide people and organizations with a basis for mutual understanding, and are used as tools to facilitate communication, measurement, commerce and manufacturing.”

What happens when things are not standardised

Or when they do not follow the same standard from place to place. Let’s look at an example: you have just finished a 12-hour trip from Belgium to the United States. You are (finally) in your hotel room, but you are tired, jet-lagged and want to eat something before going to sleep. One last step awaits: you need to tell your family and loved ones you are safe across the Atlantic. No luck: your phone is out of battery. You take out your phone charger and realise it won’t fit in the wall socket. You dig out your adapter, which is neatly hidden in your suitcase, plug in the phone and tell people your plane landed safely. Now you want to look for a restaurant, so you take out your laptop and, again, the battery is empty from the movies you watched on the plane. How do you charge your computer when your only adapter is busy charging your phone? Will you manage to find a decent restaurant with just your smartphone? Alternatively, have you ever tried to ask for an iPhone cable when all the people around you have Android phones, or the other way around (try it: reactions range from raised eyebrows to phone proselytism)? Chances are high that you will soon end up with a dead battery.

Standards permit interoperability

Thanks to common standards you can easily find AA batteries for your wireless mouse anywhere in the world and the same goes for using your mobile phone. You can make sure people can open the documents you send them if you save them as PDF, because it’s an ISO standard.

The most important aspect of a standard is that it permits interoperability: you can make sure that your product can be used by everyone. If you sell laptop chargers with Australian plugs in the UK, nobody will be able to use them without an adapter.

In the world of translation, most computer-aided translation (CAT) tools, whether commercial or free and open-source software, support the same standards.

Why standards are essential in translation

Translation is a collaborative effort: no single person manages all the aspects of a translation project. With files travelling back and forth and being handled by different people, it is crucial that everyone speaks the same language (the standard) and that their tools are able to understand each other (the interoperability). If you built a translation memory with memoQ, it’s nice to be able to use it with OmegaT, which the TMX standard enables you to do.

Within a single project, not following standards may have limited consequences. But what happens if someone else prepared a project their own way and you have to take it over? Chances are that the existing assets will arrive in a format that is not compatible with your workflow, and using them will require tedious manipulation (to go back to our wall plug example, you would need an adapter).

Most CAT tools use XML files in the background and are compatible with OAXAL (Open Architecture for XML Authoring and Localization), an initiative of the Organization for the Advancement of Structured Information Standards (OASIS), which enables “a comprehensive, efficient, and cost-effective model regarding the authoring and translation aspects of XML publishing”.

What are the most used formats in translation?

The most widely used formats of the above architecture are XLIFF, TMX and SRX. But why use these formats when most people have a word processor or spreadsheet program installed on their computers, and the formats used by that software can be opened by most office suites? The answer is short: these programs cannot leverage translation assets such as translation memories. Linguists end up frustrated, working in a crowded spreadsheet, and inconsistencies creep into the projects.

XLIFF

Let’s get a bit technical and take a look at the three formats highlighted above. The first one is XLIFF (XML Localization Interchange File Format), an XML-based format created to exchange information during the different steps of a translation process. This format clearly separates the content that is to be translated from other information, such as metadata, which isn’t. Using XLIFF allows a high level of interoperability between the different CAT tools, whether they are commercial or free software.
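
To make the separation between translatable content and metadata more concrete, here is a minimal sketch in Python that reads a tiny XLIFF document using only the standard library. It assumes XLIFF version 1.2 (the article does not name a version), and the sample file, its content and the helper name read_units are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: XLIFF 1.2, whose elements live in this namespace.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample document: two translation units, one still untranslated.
XLIFF_SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hello.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Hello world</source>
        <target state="translated">Bonjour le monde</target>
        <note>Greeting shown on the home page</note>
      </trans-unit>
      <trans-unit id="2">
        <source>Save</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_units(xliff_text):
    """Return one dict per <trans-unit>: translatable text plus its metadata."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        units.append({
            "id": unit.get("id"),                                    # metadata
            "source": unit.findtext("x:source", default="", namespaces=NS),
            "target": unit.findtext("x:target", default=None, namespaces=NS),
            "note": unit.findtext("x:note", default=None, namespaces=NS),  # metadata
        })
    return units


for unit in read_units(XLIFF_SAMPLE):
    status = "translated" if unit["target"] else "to do"
    print(unit["id"], unit["source"], "->", unit["target"], f"({status})")
```

The text to translate sits in the source and target elements, while everything around it (identifiers, languages, notes) travels along untouched, which is what lets different tools open the same file.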

TMX

TMX, or Translation Memory eXchange, allows for the storage of segment pairs (usually made when translating an XLIFF) in a translation memory, for the purpose of reusing them, either inside a single project or between different projects. The idea behind TMX is that once a segment has been translated, it should not be retranslated from scratch the next time the same (or a similar) segment appears. When used in tandem with XLIFF, the bilingual content stored in the TMX can also carry information stored in the XLIFF, such as alternative translations, comments and annotations. If a project was previously managed outside a CAT tool, a TMX can be created after the fact through alignment, a process where the source and target segments are matched up, usually with manual review.
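
The reuse idea can be sketched in a few lines of Python. The example below assumes the TMX 1.4 layout (tu, tuv and seg elements, with the language in the xml:lang attribute) and uses difflib from the standard library as a stand-in for fuzzy matching. Real CAT tools use their own scoring, so the 0.75 threshold, the sample segments and the helper names load_memory and lookup are illustrative assumptions only.

```python
import difflib
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical translation memory with two English-French segment pairs.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrez le fichier.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Close the window.</seg></tuv>
      <tuv xml:lang="fr"><seg>Fermez la fenêtre.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def load_memory(tmx_text, src="en", tgt="fr"):
    """Map each source segment to its target segment."""
    memory = {}
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segs = {tuv.get(XML_LANG): tuv.findtext("seg", default="")
                for tuv in tu.iter("tuv")}
        if src in segs and tgt in segs:
            memory[segs[src]] = segs[tgt]
    return memory


def lookup(memory, segment, cutoff=0.75):
    """Return (translation, score): exact match first, then the closest fuzzy match."""
    if segment in memory:
        return memory[segment], 1.0
    close = difflib.get_close_matches(segment, list(memory), n=1, cutoff=cutoff)
    if not close:
        return None, 0.0
    score = difflib.SequenceMatcher(None, segment, close[0]).ratio()
    return memory[close[0]], score


memory = load_memory(TMX_SAMPLE)
print(lookup(memory, "Save the file."))    # exact match
print(lookup(memory, "Save the files."))   # similar segment, fuzzy match
print(lookup(memory, "Print the report on Monday."))  # nothing close enough
```

An exact hit can be reused as is, while a fuzzy hit is only a suggestion that the translator still has to adapt.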

SRX

Finally, SRX, or Segmentation Rules eXchange, defines segmentation rules. Having a specific SRX file for a project or organisation ensures consistent segmentation, even if the project is prepared by multiple people or with different programs. Using these rules, one can define precisely where a text should (or should not) be segmented.
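
Here is a rough sketch of how such rules behave. In an SRX file, each rule states whether it breaks (break="yes" or break="no") and gives two regular expressions: beforebreak, for the text just before a candidate position, and afterbreak, for the text just after it. The Python below mimics that logic with plain tuples instead of parsing a real SRX file. The two rules, the abbreviation list and the function name segment are assumptions for illustration, and Python regular expressions are not identical to the ICU syntax that SRX specifies.

```python
import re

# Each tuple mirrors one SRX rule: (break?, beforebreak regex, afterbreak regex).
# Order matters: the first rule that matches at a position decides.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|e\.g|i\.e)\.", r"\s"),  # exception: no break after abbreviations
    (True, r"[.?!]", r"\s+"),                        # break after sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text at every position where the first matching rule says 'break'."""
    compiled = [(brk, re.compile(f"(?:{before})\\Z"), re.compile(after))
                for brk, before, after in rules]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for brk, before, after in compiled:
            # beforebreak must end exactly at pos, afterbreak must start at pos.
            if before.search(text[:pos]) and after.match(text, pos):
                if brk:
                    segments.append(text[start:pos])
                    start = pos
                break  # first matching rule wins, whether it breaks or not
    segments.append(text[start:])
    # Trim the whitespace left at segment edges, for display purposes.
    return [s.strip() for s in segments if s.strip()]


text = "Dr. Smith arrived. He was late."
print(segment(text))                  # abbreviation rule keeps "Dr. Smith" together
print(segment(text, rules=RULES[1:])) # without the exception, "Dr." becomes its own segment
```

Sharing one set of rules is what keeps segmentation identical across people and tools. Two tools with different abbreviation lists would cut the same text differently, and segments stored in a translation memory would no longer match.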

Used together, these formats can help you leverage the full potential of translation technology and make sure you can reuse all your assets from project to project with minimal effort. They also make translation technologists very happy, as they know everything will just work out of the box.

P.S. To avoid the dodgy situation of having to charge both a phone and a laptop with one plug adapter at the same time, I recommend getting a plug adapter with a built-in USB port.

Featured Image Credit – https://xkcd.com/927/