How to leverage neural machine translation in test item translation

by Pisana Ferrari – cApStAn Ambassador to the Global Village

Advances in artificial intelligence (AI) have opened up enormous opportunities in the field of translation. Today, with a well-designed – hybrid – workflow that combines human expertise and neural machine translation (NMT), it is technically possible to produce multiple language versions of test item banks faster and at a more affordable price than three years ago. If that is what you want to strive for, certain stages cannot be skipped.

Garbage in, garbage out

First of all, one has to remember that the output of machine translation (MT) engines is only as good as what is fed into them. For some language combinations and some domains, the results produced by some engines can be impressive. But when it comes to test content, disambiguation is key. Without external help, the machine cannot infer from the context whether a “Dutch teacher” actually teaches Dutch or is merely a Dutch citizen who teaches any subject. Test developers will need to disambiguate source material and work together with linguists to prepare contextual elements for both human translators and MT engines to interpret.

The mindset should be “upstream” work

What eats up a lot of time and resources is piling review upon review: reviews by linguists, subject matter experts, proofreaders, stakeholders, censors, end users, bilingual staff at the client, you name it. If decent preparation work has been done before the translation begins, there can be more automation and fewer reviews during, and after, translation. More preventative linguistic quality assurance early in the process, less corrective action late in the process. Automated quality assurance checks increase consistency and take over repetitive tasks such as harmonizing quotation marks, checking whether all segments are translated, or checking adherence to a glossary. Item writers need assistance to “engineer” the source version so that it is suitable both for translation into other languages and for a man-machine translation (MMT) workflow.
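The automated checks mentioned above can be sketched in a few lines of Python. This is a minimal illustration only, not a description of any particular tool: the function names `check_segments` and `harmonize_quotes`, the segment format (source/target pairs) and the sample glossary are all assumptions made for the example, and the glossary check uses naive substring matching that a real workflow would replace with proper terminology matching.

```python
import re

# Hypothetical helper: segments are (source, target) pairs and the glossary
# maps a source term to the target term that must be used for it.
def check_segments(segments, glossary):
    issues = []
    for idx, (source, target) in enumerate(segments):
        # Check 1: has the segment been translated at all?
        if not target.strip():
            issues.append((idx, "untranslated"))
            continue
        # Check 2: glossary adherence (naive, case-insensitive substring match).
        for src_term, tgt_term in glossary.items():
            if src_term.lower() in source.lower() and tgt_term.lower() not in target.lower():
                issues.append((idx, f"glossary: expected '{tgt_term}' for '{src_term}'"))
    return issues

# Matches text enclosed in straight or curly double quotation marks.
QUOTED = re.compile(r'["“]([^"“”]*)["”]')

# Hypothetical helper: rewrite quoted passages with one agreed pair of marks.
def harmonize_quotes(text, opening="«", closing="»"):
    return QUOTED.sub(lambda m: f"{opening}{m.group(1)}{closing}", text)
```

Run over a whole item bank, checks of this kind would produce a list of segment numbers and issue labels that a linguist can review in one pass, instead of each reviewer hunting for the same mechanical problems by hand.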

How cApStAn can help you

This assistance is exactly what cApStAn has to offer to the testing industry. You need to make sure AI is fed with high-quality input to harvest quality output. AI can leverage databases of known issues (and known workarounds). High-quality input refers both to the source version of the text and to existing, validated translations that are fed into the system (“translation memories”). Interacting with linguists and cultural brokers to train your input data boosts efficiency. To train input data means to trim it, to program rules, to create style guides, to compile glossaries, and to prepare routing, filters, dynamic text, and contextual information that can be read and interpreted by man and machine. This makes it possible to diagnose issues and propose fixes before the MMT workflow kicks in.
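To make the idea of a translation memory more concrete, here is a minimal sketch of how validated translations might be looked up for a new source segment. It is an assumption-laden illustration: the function name `best_tm_match`, the dictionary format and the 0.75 similarity threshold are invented for the example, and the similarity score comes from Python's standard `difflib.SequenceMatcher`, not from any specific translation tool.

```python
from difflib import SequenceMatcher

# Hypothetical helper: `memory` maps validated source segments to their
# validated translations. Returns (translation, score) for the closest
# stored segment, or None if nothing is similar enough.
def best_tm_match(source, memory, threshold=0.75):
    best_target = None
    best_score = 0.0
    for tm_source, tm_target in memory.items():
        # Character-based similarity between 0.0 and 1.0 (1.0 = identical).
        score = SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()
        if score > best_score:
            best_target, best_score = tm_target, score
    if best_target is not None and best_score >= threshold:
        return best_target, best_score
    return None
```

An identical segment returns the stored translation with a score of 1.0, and a near match returns it with a lower score so that a human translator can decide how much of it to reuse. The cleaner the memory, the more trustworthy those suggestions are, which is why the quality of the validated translations matters as much as the quality of the source.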

Augmenting the source version before translation

A sound approach should be proactive, collaborative and interactive. Spend some time and money reviewing and augmenting the source version together with linguists/culture brokers. The interaction between psychometricians, cognitive psychologists and linguists:

  • Instils new dynamics in item writing.
  • Produces a corpus of item-by-item translation and adaptation notes.

cApStAn’s “Translatability Assessment” is part of our Linguistic Quality Assurance process. It helps identify potential issues upfront, before translation begins, by checking test items against a list of translatability categories and reporting potential issues in a centralised monitoring tool. Alternative text may be proposed without loss of meaning.
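As a rough illustration of what checking items against a list of categories could look like when partly automated, consider the sketch below. It is not cApStAn's actual method or category list: the two categories ("ambiguity" and "sentence length"), the word list, the 30-word limit and the names `Flag` and `assess_item` are hypothetical choices made purely for the example.

```python
import re
from dataclasses import dataclass

@dataclass
class Flag:
    item_id: str
    category: str
    detail: str

# Hypothetical list of adjectives that can denote a language or a nationality.
AMBIGUOUS_MODIFIERS = {"dutch", "french", "english", "german", "spanish"}

def assess_item(item_id, text, max_words=30):
    flags = []
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    # Category 1: "<language/nationality> teacher" can be read in two ways.
    for first, second in zip(words, words[1:]):
        if first in AMBIGUOUS_MODIFIERS and second == "teacher":
            flags.append(Flag(item_id, "ambiguity",
                              f"'{first} {second}': language taught or nationality?"))
    # Category 2: very long sentences tend to be harder to translate.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        n_words = len(sentence.split())
        if n_words > max_words:
            flags.append(Flag(item_id, "sentence length", f"{n_words} words"))
    return flags
```

Flags of this kind are only prompts for a human reviewer. Deciding whether an item really needs a translation note or alternative wording remains the job of the linguists and item writers.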

Read more about our Linguistic Quality Assurance (LQA) processes here

Action points for the testing industry

It is important to determine the skills that item developers (and human linguists) need to acquire/hone so that their expertise can effectively be woven into new MMT workflows.

If testing organizations can work with experts to define the new skills that linguists will need (translation technology, natural language processing, discernment) and test linguists for these skills, they will have a new product that will be in high demand in the buoyant localization market.

It is high time to:

  • Upgrade your item pools to translatable master versions, ready to input into AI.
  • Train a good MT engine with high-quality translations.
  • Enlist the help of experts to usefully supplement NMT in tomorrow’s Man-Machine Translation workflows.