Wednesday, March 11, 2020

Google AI Blog: Exploring Massively Multilingual, Massive Neural Machine Translation

Highly recommended! This is a very well written overview by Google staff of massively multilingual neural machine translation (NMT). It incorporates some of the latest NMT approaches, and the NMT models Google uses are absolutely gigantic!

The blog post opens with this quite remarkable quote:
"“... perhaps the way [of translation] is to descend, from each language, down to the common base of human communication — the real but as yet undiscovered universal language — and then re-emerge by whatever particular route is convenient.” — Warren Weaver, 1949"

"Once trained using all of the available data (25+ billion examples from 103 languages), we observe strong positive transfer towards low-resource languages, dramatically improving the translation quality of 30+ languages at the tail of the distribution by an average of 5 BLEU points. ...

At least half of the 7,000 languages currently spoken will no longer exist by the end of this century*. Can multilingual machine translation come to the rescue? We see the M4 approach as a stepping stone towards serving the next 1,000 languages; starting from such multilingual models will allow us to easily extend to new languages, domains and down-stream tasks, even when parallel data is unavailable. Indeed the path is rocky, and on the road to universal MT many promising solutions appear to be interdisciplinary."


Google AI Blog: Exploring Massively Multilingual, Massive Neural Machine Translation: Posted by Ankur Bapna, Software Engineer and Orhan Firat, Research Scientist, Google Research “... perhaps the way [of translation] is t...

