16.06.2023

Introducing the Translation and Adaptation Chapter of the ITC/ATP Guidelines for Technology-Based Assessment

This article by Steve Dept, Founding Partner of cApStAn, was originally published on the ATP Global website and is part of a series supporting the ITC/ATP Guidelines for Technology-Based Assessment. The article can be downloaded at this link.

The myth that a well-crafted test with robust psychometric properties is, in theory, always portable across languages and cultures has been dispelled by scholars (Hambleton et al., 2005) and by field practitioners alike. It takes more than a professional translator, a reviewer, and a focus group to ensure and ascertain equivalence. When translating or adapting an assessment, one needs to be aware of potential bias, construct-irrelevant variance, and culture-driven perception shifts.

A sound translation and adaptation design can mitigate these risks, and this requires some level of sophistication. There are existing standards and guidelines that describe such designs in detail, e.g., the International Test Commission Guidelines for Translating and Adapting Tests (2018). The new ITC/ATP Guidelines for Technology-Based Assessment would not be complete without a section on global considerations, and it comes as no surprise that these considerations include a chapter on assessment translation and adaptation. The editors, Prof. Stephen G. Sireci and John Weiner, envisaged this as an annotated collection of references to recognised best practices, so as to provide a concise overview of what matters most in test translation and adaptation.

Making of the Translation and Adaptation Chapter

The editors invited me to compile information and write that chapter because I am a field practitioner with a linguistic background who has actively contributed to shaping, testing, improving, and applying best practice in assessment translation for the past 25 years. I saw this as an opportunity to revisit the literature and to let the experience I have accumulated in international large-scale assessments (ILSAs) percolate into an accessible summary. The chapter was designed to function as a springboard to essential publications, giving the reader an initial understanding of what to look for.

My draft chapter immensely benefited from the knowledgeable, insightful feedback of some of the sharpest minds in the industry: Kadriye Ercikan, Avi Allalouf and Maria-Elena Oliveri. All three have published about comparability issues, all three have a different mother tongue (and it is not English) and all three know that I take kindly to constructive criticism. After several iterations, we were collectively confident that the chapter had become a useful tool to set translation and linguistic quality assurance priorities, and to provide access to essential references.

Salient points

An effort was made to focus on those aspects that apply more specifically to technology-based assessments. Guideline 11.10, for example, draws the reader’s attention to the fact that “Suitability of the authoring platform should be investigated for the target languages/cultural variations envisaged and related technical challenges identified” and one of the comments suggests “exporting content from the platform and producing translations outside the platform (see translation data exchange standards).”
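
To give a purely illustrative idea of what such an export can look like (the Guidelines do not prescribe a specific format, and the item identifiers and texts below are invented), here is a minimal Python sketch that writes item content to XLIFF 1.2, one of the widely used translation data exchange standards, so that translation can take place in dedicated tools outside the authoring platform:

```python
# Purely illustrative sketch: export assessment item text to XLIFF 1.2,
# a widely used translation data exchange standard, so that translation
# can take place outside the authoring platform. Item IDs and texts are invented.
import xml.etree.ElementTree as ET

items = [
    {"id": "MATH-001", "source": "A train travels 120 km in 90 minutes. What is its average speed?"},
    {"id": "READ-014", "source": "Read the notice and answer the question below."},
]

xliff = ET.Element("xliff", attrib={
    "version": "1.2",
    "xmlns": "urn:oasis:names:tc:xliff:document:1.2",
})
file_el = ET.SubElement(xliff, "file", attrib={
    "original": "item_bank_export",
    "source-language": "en",
    "target-language": "fr-FR",  # the target locale (language/country combination)
    "datatype": "plaintext",
})
body = ET.SubElement(file_el, "body")

for item in items:
    unit = ET.SubElement(body, "trans-unit", attrib={"id": item["id"]})
    ET.SubElement(unit, "source").text = item["source"]
    ET.SubElement(unit, "target").text = ""  # to be filled in by the translation team

ET.ElementTree(xliff).write("item_export.xlf", encoding="utf-8", xml_declaration=True)
```

Round-tripping content through such a format keeps item identifiers intact, which makes it easier to re-import the translated targets into the platform and to keep the translation history traceable.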

Of course, some of the principles listed in the Guidelines apply to all delivery modes. The time and resources allocated to preparation work, which we usually refer to as upstream work, drive the quality of the translated test items and determine the level of comparability that can and should be achieved. Embedding translation and adaptation into test design is the best-case scenario. In any case, the constructs one wants to measure need to be defined and the portability of these constructs needs to be investigated beforehand.

To set translation teams up for success, one needs to provide them with adequate support. This does not necessarily mean extensive general guidelines and abundant reference material. Rather, item-by-item translation and adaptation notes, validated by the test developers, help linguists focus on the features that are most likely to have a bearing on equivalence.
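
As an illustration of what such targeted guidance can look like (the structure, item and notes below are invented for the example, not taken from the Guidelines), each item can carry its own validated notes alongside the source text:

```python
# Purely illustrative sketch: item-by-item translation and adaptation notes,
# validated by the test developers, attached to the item they concern.
# Field names, item text and notes are invented for the example.
from dataclasses import dataclass, field

@dataclass
class AssessmentItem:
    item_id: str
    source_text: str
    translation_notes: list[str] = field(default_factory=list)

item = AssessmentItem(
    item_id="SCI-042",
    source_text="The kettle holds 1.5 litres of water.",
    translation_notes=[
        "Keep the metric unit 'litres'; do not convert to local units (construct-relevant).",
        "'Kettle' may be adapted to a more familiar household appliance.",
        "Keep sentence length comparable to the source to avoid adding reading load.",
    ],
)
```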

The number of steps in a quality assurance design does not necessarily correlate with the quality of the final output. It is crucial to draw up precise specifications for each step and to have a standardised documentation protocol. The translation and adaptation history of each assessment item in each language should remain available to inform adjudication and, at a later stage, to allow research into the measurable effects of specific types of adaptation. Back translation, which is still widely used, only gives limited information about the suitability of a translated test for its data collection purpose.
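
A minimal sketch of what a standardised documentation record might capture for one item in one locale is shown below; the step names and fields are assumptions for illustration, not a prescribed protocol. Keeping such records for every item and every language is what later makes adjudication, and research into adaptation effects, possible:

```python
# Purely illustrative sketch: one record per quality assurance step, kept for
# each item in each locale, so that the full translation and adaptation history
# remains available for adjudication and later research. All values are invented.
translation_history = {
    "item_id": "SCI-042",
    "locale": "es-MX",
    "steps": [
        {"step": "translation_1", "actor": "Translator A", "output": "<first independent translation>", "comment": ""},
        {"step": "translation_2", "actor": "Translator B", "output": "<second independent translation>", "comment": ""},
        {"step": "reconciliation", "actor": "Reconciler", "output": "<merged version>", "comment": "Documented why the wording of translation 1 was preferred."},
        {"step": "verification", "actor": "Verifier", "output": "<verified version>", "comment": "Flagged a unit adaptation for adjudication."},
    ],
}
```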

In the chapter, we also emphasize that no translation and adaptation design would be complete without some form of piloting or field testing. It is not sufficient to pilot the source version. One needs to collect data on the translated and adapted versions of the instrument to measure equivalence.
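
As a deliberately crude illustration of why such data matter (this is only a first screen, not the differential item functioning analyses recommended in the literature, and the figures and threshold below are invented), one can compare per-item statistics across language versions and flag large gaps for expert review:

```python
# Purely illustrative sketch: a crude first screen on pilot data, comparing the
# proportion of correct responses per item across two language versions and
# flagging large gaps for expert review. Proper equivalence work relies on
# differential item functioning (DIF) analyses; all figures are invented.
p_correct = {
    "MATH-001": {"en": 0.62, "fr-FR": 0.59},
    "READ-014": {"en": 0.71, "fr-FR": 0.48},
    "SCI-042":  {"en": 0.55, "fr-FR": 0.57},
}

FLAG_THRESHOLD = 0.10  # arbitrary gap beyond which an item goes to expert review

for item_id, stats in p_correct.items():
    gap = abs(stats["en"] - stats["fr-FR"])
    if gap > FLAG_THRESHOLD:
        print(f"{item_id}: gap of {gap:.2f} between versions, review translation and adaptation")
```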

To conclude

These twelve Guidelines for Translation and Adaptation of Technology-Based Assessments and the references they provide form a reliable starting point for planning the development and quality assurance of multiple language versions (or locales, i.e., language/country combinations) of your technology-based assessment.

You can freely access and download the Guidelines from https://www.testpublishers.org/white-papers or https://www.intestcom.org/page/16 and navigate to Chapter 11: Global Testing Considerations. The twelve Guidelines for Translation and Adaptation of Technology-Based Assessments are numbered 11.1 to 11.12.