
Machine Translation on WMT2014 English-German

Evaluation Metrics

BLEU score
Hardware Burden
Operations per network pass
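
The table below reports corpus-level BLEU. As an illustration of how such a score is typically computed, here is a minimal sketch using the sacrebleu Python package; the hypothesis and reference sentences are made-up placeholders, not output from any system on this leaderboard.

```python
# Minimal sketch: corpus-level BLEU with sacrebleu (placeholder data).
import sacrebleu

# Hypothetical system outputs and reference translations (illustrative only).
hypotheses = [
    "The cat sits on the mat .",
    "There is a book on the table .",
]
references = [
    "The cat is sitting on the mat .",
    "A book is on the table .",
]

# sacrebleu takes a list of hypotheses and a list of reference streams,
# so a single reference set is wrapped in an outer list.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```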

Evaluation Results

Performance of each model on this benchmark

| Model | BLEU score | Hardware Burden | Operations per network pass | Paper Title | Repository |
|---|---|---|---|---|---|
| Transformer Big + adversarial MLE | 29.52 | - | - | Improving Neural Language Modeling via Adversarial Training | - |
| MAT | - | - | - | Multi-branch Attentive Transformer | - |
| AdvAug (aut+adv) | 29.57 | - | - | AdvAug: Robust Adversarial Augmentation for Neural Machine Translation | - |
| CMLM+LAT+4 iterations | 27.35 | - | - | Incorporating a Local Translation Mechanism into Non-autoregressive Translation | - |
| FlowSeq-large (IWD n = 15) | 22.94 | - | - | FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | - |
| Transformer (ADMIN init) | 30.1 | - | - | Very Deep Transformers for Neural Machine Translation | - |
| MUSE (Parallel Multi-scale Attention) | 29.9 | - | - | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning | - |
| Transformer-DRILL Base | 28.1 | - | - | Deep Residual Output Layers for Neural Language Generation | - |
| Transformer Big with FRAGE | 29.11 | - | - | FRAGE: Frequency-Agnostic Word Representation | - |
| GLAT | 25.21 | - | - | Glancing Transformer for Non-Autoregressive Neural Machine Translation | - |
| PartialFormer | 29.56 | - | - | PartialFormer: Modeling Part Instead of Whole for Machine Translation | - |
| Bi-SimCut | 30.78 | - | - | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | - |
| Transformer + SRU | 28.4 | - | 34G | Simple Recurrent Units for Highly Parallelizable Recurrence | - |
| PBMT | 20.7 | - | - | - | - |
| Local Joint Self-attention | 29.7 | - | - | Joint Source-Target Self Attention with Locality Constraints | - |
| Lite Transformer | 26.5 | - | - | Lite Transformer with Long-Short Range Attention | - |
| Average Attention Network (w/o FFN) | 26.05 | - | - | Accelerating Neural Transformer via an Average Attention Network | - |
| Unsupervised NMT + Transformer | 17.16 | - | - | Phrase-Based & Neural Unsupervised Machine Translation | - |
| KERMIT | 28.7 | - | - | KERMIT: Generative Insertion-Based Modeling for Sequences | - |
| T2R + Pretrain | 28.7 | - | - | Finetuning Pretrained Transformers into RNNs | - |