Machine Translation on WMT2014 English-German
Metrics
BLEU score
Hardware Burden
Operations per network pass
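BLEU is the translation-quality metric on this benchmark, while Hardware Burden and Operations per network pass track compute cost (the "G" suffix in the table presumably denotes giga). As a minimal sketch, the snippet below computes a corpus-level BLEU score with the sacrebleu package; the package choice and the toy sentences are assumptions for illustration, not the scoring setup used by the listed papers.

```python
# Minimal sketch: corpus-level BLEU with sacrebleu (an assumption; the
# leaderboard does not specify the exact scoring tool each paper used).
import sacrebleu

# Hypothetical system outputs and references for WMT14 En-De.
hypotheses = [
    "Der Hund rennt über die Wiese .",
    "Wir haben das Modell auf WMT14 trainiert .",
]
references = [
    "Der Hund läuft über die Wiese .",
    "Wir trainierten das Modell auf WMT14 .",
]

# sacrebleu takes a list of hypothesis strings and a list of reference
# streams (one list of strings per reference set).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```

Individual papers may differ in tokenization and casing when reporting BLEU, so cross-paper comparisons in the table below should be read with that caveat.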
Results
Performance results of various models on this benchmark; missing values are shown as "-".
Comparison Table
Model Name | BLEU score | Hardware Burden | Operations per network pass |
---|---|---|---|
improving-neural-language-modeling-via | 29.52 | - | - |
multi-branch-attentive-transformer | - | - | - |
advaug-robust-adversarial-augmentation-for-1 | 29.57 | - | - |
incorporating-a-local-translation-mechanism | 27.35 | - | - |
flowseq-non-autoregressive-conditional | 22.94 | - | - |
very-deep-transformers-for-neural-machine | 30.1 | - | - |
muse-parallel-multi-scale-attention-for | 29.9 | - | - |
deep-residual-output-layers-for-neural | 28.1 | - | - |
frage-frequency-agnostic-word-representation | 29.11 | - | - |
glancing-transformer-for-non-autoregressive | 25.21 | - | - |
partialformer-modeling-part-instead-of-whole | 29.56 | - | - |
bi-simcut-a-simple-strategy-for-boosting-1 | 30.78 | - | - |
simple-recurrent-units-for-highly | 28.4 | 34G | - |
Model 14 | 20.7 | - | - |
190506596 | 29.7 | - | - |
lite-transformer-with-long-short-range | 26.5 | - | - |
accelerating-neural-transformer-via-an | 26.05 | - | - |
phrase-based-neural-unsupervised-machine | 17.16 | - | - |
kermit-generative-insertion-based-modeling | 28.7 | - | - |
finetuning-pretrained-transformers-into-rnns | 28.7 | - | - |
accelerating-neural-transformer-via-an | 26.31 | - | - |
mask-attention-networks-rethinking-and | 29.1 | - | - |
effective-approaches-to-attention-based | 20.9 | - | - |
attention-is-all-you-need | 27.3 | - | 330000000.0G |
modeling-localness-for-self-attention | 29.2 | - | - |
edinburghs-syntax-based-systems-at-wmt-2015 | 20.7 | - | - |
effective-approaches-to-attention-based | 11.3 | - | - |
rethinking-perturbations-in-encoder-decoders | 33.89 | - | - |
convolutional-sequence-to-sequence-learning | 25.16 | 72G | - |
attention-is-all-you-need | 28.4 | 871G | 2300000000.0G |
sequence-level-knowledge-distillation | 18.5 | - | - |
random-feature-attention-1 | 28.2 | - | - |
neural-machine-translation-in-linear-time | 23.75 | - | - |
phrase-based-neural-unsupervised-machine | 20.23 | - | - |
levenshtein-transformer | 27.27 | - | - |
universal-transformers | 28.9 | - | - |
scaling-neural-machine-translation | 29.3 | 9G | - |
deterministic-non-autoregressive-neural | 21.54 | - | - |
non-autoregressive-translation-by-learning | 26.6 | - | - |
adaptively-sparse-transformers | 26.93 | - | - |
the-evolved-transformer | 29.8 | - | - |
flowseq-non-autoregressive-conditional | 18.55 | - | - |
rethinking-batch-normalization-in | 30.1 | - | - |
data-diversification-an-elegant-strategy-for | 30.7 | - | - |
phrase-based-neural-unsupervised-machine | 17.94 | - | - |
resmlp-feedforward-networks-for-image | 26.4 | - | - |
subformer-a-parameter-reduced-transformer | 29.3 | - | - |
neural-machine-translation-with-adequacy | 28.99 | - | - |
googles-neural-machine-translation-system | 26.3 | - | - |
flowseq-non-autoregressive-conditional | 23.14 | - | - |
depthwise-separable-convolutions-for-neural | 26.1 | - | - |
adaptively-sparse-transformers | 25.89 | - | - |
lessons-on-parameter-sharing-across-layers-in | 35.14 | - | - |
effective-approaches-to-attention-based | 14.0 | - | - |
non-autoregressive-neural-machine-translation-1 | 19.17 | - | - |
unsupervised-statistical-machine-translation | 14.08 | - | - |
weighted-transformer-network-for-machine | 28.9 | - | - |
advaug-robust-adversarial-augmentation-for-1 | 28.08 | - | - |
convolutional-sequence-to-sequence-learning | 26.4 | 54G | - |
non-autoregressive-translation-with | 27.06 | - | - |
synthesizer-rethinking-self-attention-in | 28.47 | - | - |
omninet-omnidirectional-representations-from | 29.8 | - | - |
the-evolved-transformer | 28.4 | 2488G | - |
accelerating-neural-transformer-via-an | 25.91 | - | - |
pay-less-attention-with-lightweight-and | 28.9 | - | - |
learning-to-encode-position-for-transformer | 29.2 | - | - |
r-drop-regularized-dropout-for-neural | 30.91 | 49G | - |
advaug-robust-adversarial-augmentation-for-1 | 28.58 | - | - |
flowseq-non-autoregressive-conditional | 23.64 | - | - |
self-attention-with-relative-position | 29.2 | - | - |
pay-less-attention-with-lightweight-and | 29.7 | - | - |
time-aware-large-kernel-convolutions | 29.6 | - | - |
exploring-the-limits-of-transfer-learning | 32.1 | - | - |
resmlp-feedforward-networks-for-image | 26.8 | - | - |
bert-mbert-or-bibert-a-study-on | 31.26 | - | - |
mega-moving-average-equipped-gated-attention | 29.01 | - | - |
dense-information-flow-for-neural-machine | 25.52 | - | - |
mask-attention-networks-rethinking-and | 30.4 | - | - |
the-best-of-both-worlds-combining-recent | 28.5 | 44G | 2.81G |
incorporating-bert-into-neural-machine-1 | 30.75 | - | - |
fast-and-simple-mixture-of-softmaxes-with-bpe | 29.6 | - | - |
hat-hardware-aware-transformers-for-efficient | 28.4 | - | - |
flowseq-non-autoregressive-conditional | 20.85 | - | - |
deep-recurrent-models-with-fast-forward | 20.7 | 119G | - |
understanding-back-translation-at-scale | 35.0 | 146G | - |
bi-simcut-a-simple-strategy-for-boosting-1 | 30.56 | - | - |
outrageously-large-neural-networks-the | 26.03 | 24G | - |
synchronous-bidirectional-neural-machine | 29.21 | - | - |
incorporating-a-local-translation-mechanism | 25.20 | - | - |
depth-growing-for-neural-machine-translation | 30.07 | 24G | - |
neural-semantic-encoders | 17.9 | - | - |
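To query or rank the comparison table programmatically, it can be loaded into a dataframe. A minimal sketch, assuming the table has been exported to a CSV file named wmt14_en_de_results.csv with the same column headers (the filename and export step are assumptions; this page does not provide a download):

```python
# Minimal sketch: ranking the comparison table by BLEU score with pandas.
# Assumes a CSV export "wmt14_en_de_results.csv" with the columns
# Model Name, BLEU score, Hardware Burden, Operations per network pass.
import pandas as pd

df = pd.read_csv("wmt14_en_de_results.csv")

# Treat "-" placeholders as missing values and coerce BLEU to numeric.
df = df.replace("-", pd.NA)
df["BLEU score"] = pd.to_numeric(df["BLEU score"], errors="coerce")

# Show the ten highest-scoring entries.
top10 = df.sort_values("BLEU score", ascending=False).head(10)
print(top10[["Model Name", "BLEU score"]].to_string(index=False))
```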