| Model | BLEU | Paper |
|---|---|---|
| Transformer+BT (ADMIN init) | 46.4 | Very Deep Transformers for Neural Machine Translation |
| Noisy back-translation | 45.6 | Understanding Back-Translation at Scale |
| Transformer (big) + Relative Position Representations | 41.5 | Self-Attention with Relative Position Representations |
| Transformer Big | 41.0 | Attention Is All You Need |
| Transformer Base | 38.1 | Attention Is All You Need |
| Unsupervised attentional encoder-decoder + BPE | 14.36 | Unsupervised Neural Machine Translation |