An interesting and short paper!
It is peculiar, though, that translation from foreign languages into English was significantly worse than translation in the reverse direction.
[2008.09396] Neural Machine Translation without Embeddings

From the abstract: "To that end, we omit the input and output embeddings from a standard machine translation model, and represent text as a sequence of bytes via UTF-8 encoding, using a constant 256-dimensional one-hot representation for each byte. Experiments on 10 language pairs show that removing the embedding matrix consistently improves the performance of byte-to-byte models, often outperforms character-to-character models, and sometimes even produces better translations than standard subword models."
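For concreteness, here is a minimal NumPy sketch of the input representation the abstract describes (my own illustration, not code from the paper): each UTF-8 byte becomes a fixed 256-dimensional one-hot vector, so there is no learned embedding matrix to train.

```python
import numpy as np

def bytes_to_one_hot(text: str) -> np.ndarray:
    """Encode text as UTF-8 bytes, then map each byte to a
    constant 256-dimensional one-hot vector (no embedding matrix)."""
    byte_ids = list(text.encode("utf-8"))  # integer values in 0..255
    one_hot = np.zeros((len(byte_ids), 256), dtype=np.float32)
    one_hot[np.arange(len(byte_ids)), byte_ids] = 1.0
    return one_hot

# "héllo" is 6 bytes in UTF-8 (the é takes two bytes),
# so we get a 6 x 256 matrix of one-hot rows.
x = bytes_to_one_hot("héllo")
print(x.shape)  # (6, 256)
```

Note how this sidesteps vocabulary choices entirely: any string in any language maps to the same fixed 256-symbol byte alphabet, at the cost of longer sequences for non-ASCII text.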