Addressing suitability, sensitivity and portability of constructs in tests developed in a single language (and delivered across different countries — or not)

by Pisana Ferrari – cApStAn Ambassador to the Global Village

Standardized tests are used in a variety of contexts: admission to university, certification or hiring. Fairness, validity and reliability are objectives that every testing organisation claims or strives to address. Leaving aside the challenges of delivering assessments in multiple languages, how do you address cultural suitability, cultural sensitivity and the portability of constructs across regions, cultures, and socio-economic backgrounds in tests delivered in a single language? When an assessment prepared in UK or US English is then used in Ireland, Singapore, Australia or South Africa, the English source must be adapted to produce a version that is suitable for each country. Culture may cause perception shifts that can threaten the defensibility of an assessment.

In a “Peas in a Pod” live conversation at ATP Innovations in Testing 2021, our CEO Steve Dept described some of the issues that a cultural review can help detect.

He explained that it is essential to examine the setting, the social markers in the authentic context of the assessment, the register, the reading load, cultural reference points, and stereotypes. Both the level of reading proficiency required and the contextual elements are likely to introduce construct-irrelevant variance.

To illustrate this in a broader sense, Steve gave the following example for the audience to react to:

An IT Programming Certification Test that is language-agnostic sets a subset of programming tasks in a sports context (baseball, skiing and Formula 1 racing). The purpose is to measure the programming skills of the candidates. Some may have a more limited proficiency in English, so the test developers steered clear of what they thought was complex phraseology or less frequently used words.

However, the ski slope, slalom gates and track environment are totally unfamiliar to over 20% of your candidates. They will find it more challenging to picture the problem you are asking them to solve by writing an algorithm. In other words, they may be put at a disadvantage. Cross-cultural suitability is essential, not only for the functioning but also for the fairness of any measurement instrument. The problem can be solved in several ways:

  • You prepare several versions, e.g., one version with skiing and another with skateboarding (or one with baseball and one with cricket), and you define a criterion to allocate one or the other to the candidate (e.g., the region).
  • You review your test to attain the most neutral context possible. Use “a device”, “a ball”, “speed” rather than an authentic context.
  • Or you choose the opposite road: balance out authentic contexts of different flavours so that any candidate is likely to find ~15% of the tasks unfamiliar.
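For readers who build test delivery platforms, the first and third options can be pictured with a small Python sketch. Everything in it is an illustrative assumption rather than a description of any real system: the region codes, the region-to-sport mapping, and the function names `allocate_version` and `unfamiliar_share` are all hypothetical.

```python
# Hypothetical mapping from a candidate's region to a sports context
# assumed to be familiar there. Codes and contexts are illustrative only.
REGION_CONTEXT = {
    "US": "baseball",
    "IN": "cricket",
    "CH": "skiing",
    "BR": "skateboarding",
}

# Fallback for regions without a dedicated variant: the neutral version
# that uses "a device", "a ball", "speed" instead of an authentic context.
NEUTRAL_CONTEXT = "neutral"


def allocate_version(region: str) -> str:
    """Return the context variant for a candidate, or the neutral one."""
    return REGION_CONTEXT.get(region.upper(), NEUTRAL_CONTEXT)


def unfamiliar_share(task_contexts: list[str], familiar: set[str]) -> float:
    """Return the share of tasks set in a context the candidate does not know."""
    if not task_contexts:
        return 0.0
    unfamiliar = sum(1 for context in task_contexts if context not in familiar)
    return unfamiliar / len(task_contexts)
```

A call such as `allocate_version("in")` would select the cricket variant, while an unlisted region falls back to the neutral wording. `unfamiliar_share` could be used to check whether a balanced form stays close to the intended proportion of unfamiliar tasks for each candidate profile. Deciding which contexts count as familiar for whom is exactly what a cultural review is meant to inform.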

How would you go about addressing this?

Try a cultural review!

The linguistic quality assurance (LQA) process at cApStAn involves professional linguists who belong to each target population of the survey or assessment under development. We guide these linguists to evaluate the suitability of the master version for their local economy, education system and social context, its sociocultural appropriateness in their region, and any other relevant parameters.

The outcome of this LQA process includes translation notes and adaptation guidelines for each regional version of an instrument, as well as recommendations to amend the master version—without loss of meaning—so as to avoid any culturally inappropriate or insensitive elements.

