
Why back translation is inadequate to assess quality in a translated test

by Steve Dept, cApStAn CEO

When translating an assessment, a linguistically correct, fluent translation does not ensure that the same constructs are measured, that test items are understood the same way or, for that matter, that the fairness, reliability and validity of the test have been maintained. This explains why more sophisticated translation designs are required to obtain robust translated tests. It also explains why back translation is not sufficient to evaluate translation quality. Richard Brislin described the back translation method in the 1970s (Brislin 1970, 1976), but one often forgets that he also described its shortcomings.

It is well known that overly literal translations of test questions don’t work well. To take a simple example: if the question is “how many sides does a hexagon have?” and the correct translation of hexagon in the target language is “six-sided figure”, then the translated test question “how many sides does a six-sided figure have?” is obviously not equivalent to the English question, because it gives away the answer. One needs to adapt the question, because the straightforward, correct translation results in a loss of equivalence. In this case, unfortunately, back translation might do little more than confirm that the question has been translated correctly. As a rule, more literal translations score well when back translated, yet they often result in item bias or differential item functioning.

Likewise, a back translation will not give you enough information about fluency and appropriateness of register in the translated test. Whether a form of address is formal or informal in the translated test, it is likely to be back translated to a uniform “you” if the source language is English. A lost grammatical match between a question stem and its response categories may also go undetected by back translation. The ITC Guidelines for Translating and Adapting Tests (2nd Edition) say: “The main drawback of the backward translation design is that, if this design is implemented in its narrowest form, no review of the target language version of the test is ever done. The design too often results in a target language version of the test which maximizes the ease of back translation, but sometimes produces a rather awkward target language version of the test.”

Indeed, if the test translation project is well prepared, and clear, user-friendly item-by-item translation and adaptation notes are developed, then one can task the translator with commenting on how s/he addressed each of these notes; and one can task the reviewer with formally confirming that each translation and adaptation note is satisfactorily addressed in the translated test. The reviewer can use back translation to document potential equivalence issues, so that the test owners are in a position to understand the issue. The reviewer can also implement proposed corrections directly in the translated test.

I believe that the enduring, overstated success of the back translation method comes from the fact that it gives test authors an illusory sense of control. While back translation will help detect mistranslations in the translated test, which the test authors can then ask to have corrected (and request another round of back translation), it does not always ensure that equivalence flaws are identified. A more effective and efficient approach is to combine robust item-by-item translation and adaptation notes with a high level of confidence in the reviewer’s ability to adequately report potential issues in the translated test and fix them as needed, possibly with additional guidance from the test authors.

If you’d like to find out more about alternatives to back translation to assess the quality of a translated test, do e‑mail us your queries and we’ll get back to you as soon as we can.


  • ITC Guidelines for Translating and Adapting Tests (2nd Edition)