cApStAn


Working at the intersection of linguistics and artificial intelligence to advance machine translation performance

by Pisana Ferrari – cApStAn Ambassador to the Global Village

 

Chris Callison-Burch, associate professor in Computer and Information Science at the University of Pennsylvania, has in recent years developed novel cost- and time-saving methods for translating languages, drawing on crowdsourcing and images. In a recent interview for “Medium” he describes a new translation method that looks very promising for some of the world’s most difficult-to-translate languages. His research group used images (for instance, of a cat), plus vast quantities of crowdsourced data identifying the words linked to each image, to create “reverse-engineered dictionaries” for 10,000 words in 100 languages. Images “are somehow interlingual”, he says: an image of a cat is the same whether it is labelled in English or Indonesian. Simplified representations of the images were used to train the model. “This language-independent way of thinking about words through their visual representations allows us to use a new type of data to learn translations.” In a recent post we mentioned research along the same lines by Chinese e-commerce giant Alibaba, which has trained an NMT system with image descriptions in multiple languages.
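To make the idea of images as an "interlingual" bridge more concrete, here is a minimal sketch in Python. It is not the research group's actual method or code: it simply assumes that each word comes with a few image feature vectors (the three-dimensional numbers below are made up for illustration), averages them, and pairs each source word with the target word whose images look most similar. The function names `mean_vector`, `cosine` and `induce_dictionary` are hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length feature vectors, component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def induce_dictionary(source_images, target_images):
    """Pair each source word with the target word whose images are most similar."""
    target_means = {word: mean_vector(vecs) for word, vecs in target_images.items()}
    dictionary = {}
    for word, vecs in source_images.items():
        source_mean = mean_vector(vecs)
        dictionary[word] = max(
            target_means, key=lambda t: cosine(source_mean, target_means[t])
        )
    return dictionary


# Toy data: invented feature vectors standing in for simplified image representations.
english = {
    "cat": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "house": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
indonesian = {
    "kucing": [[0.85, 0.15, 0.05]],
    "rumah": [[0.05, 0.9, 0.25]],
}

dictionary = induce_dictionary(english, indonesian)
print(dictionary)
```

In this toy setting the word pairs are recovered without any parallel text, only from the similarity of the images attached to each word. A real system would need image features learned from large collections of photos, and far more words and images per word.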

Medium article: https://medium.com/penn-engineering/translating-the-worlds-languages-e100f98c4c1d

cApStAn blog post on NMT research: http://www.capstan.be/promising-research-on-machine-translation-for-low-resource-languages/

 
