Monday, October 12, 2020

Notes on Neural Machine Translation without Embeddings

An interesting and short paper!

It is peculiar, though, that translation from foreign languages into English was significantly worse than translation in the reverse direction.

[2008.09396] Neural Machine Translation without Embeddings

"To that end, we omit the input and output embeddings from a standard machine translation model, and represent text as a sequence of bytes via UTF-8 encoding, using a constant 256-dimension one-hot representation for each byte. Experiments on 10 language pairs show that removing the embedding matrix consistently improves the performance of byte-to-byte models, often outperforms character-to-character models, and sometimes even produces better translations than standard subword models."
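The core trick is simple: every token is a raw UTF-8 byte, and instead of looking up a row in a learned embedding matrix, each byte is mapped to a fixed 256-dimensional one-hot vector. A minimal sketch of that input representation (function names are my own, not from the paper's code):

```python
import numpy as np

def bytes_one_hot(text: str) -> np.ndarray:
    """Encode text as UTF-8 bytes and map each byte to a constant
    256-dimensional one-hot vector (no learned embedding matrix)."""
    byte_ids = list(text.encode("utf-8"))  # sequence of ints in 0..255
    one_hot = np.zeros((len(byte_ids), 256), dtype=np.float32)
    one_hot[np.arange(len(byte_ids)), byte_ids] = 1.0
    return one_hot

x = bytes_one_hot("héllo")  # 'é' occupies two UTF-8 bytes
print(x.shape)              # (6, 256): 6 bytes, 256-dim one-hot each
```

Note that non-ASCII characters expand to multiple bytes, so byte sequences are longer than character or subword sequences, which is part of the trade-off the paper evaluates.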
