Word-level CNN w/attn, input feeding | 24.0 | Sequence-to-Sequence Learning as Beam-Search Optimization | |
Denoising autoencoders (non-autoregressive) | 32.43 | Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement | |
Transformer with FRAGE | 33.97 | FRAGE: Frequency-Agnostic Word Representation | |