18.04.2019

Working at the intersection of linguistics and artificial intelligence to advance machine translation performance

Published in: Artificial intelligence, Translation technology

by Pisana Ferrari – cApStAn Ambassador to the Global Village

Chris Callison-Burch, associate professor in Computer and Information Science at the University of Pennsylvania, has in recent years developed novel cost- and time-saving methods for translating languages, including approaches based on crowdsourcing and on images. In a recent interview published on Medium he describes a new translation method that looks very promising for some of the world’s most difficult-to-translate languages. His research group used images (for instance, of a cat), plus vast quantities of crowdsourced data identifying the words linked to each image, to create “reverse-engineered dictionaries” for 10,000 words in 100 languages. Images “are somehow interlingual”, he says: an image of a cat is the same whether the word for it is English or Indonesian. Simplified representations of the images were used to train the model. “This language-independent way of thinking about words through their visual representations allows us to use a new type of data to learn translations.” In a recent post we mentioned research along the same lines by Chinese e-commerce giant Alibaba, which has trained a neural machine translation (NMT) system with image descriptions in multiple languages.

Medium article: https://medium.com/penn-engineering/translating-the-worlds-languages-e100f98c4c1d

cApStAn blog post on NMT research: https://www.capstan.be/promising-research-on-machine-translation-for-low-resource-languages/
