Machine Translation on WMT 2014 English-German
Evaluation Metrics: BLEU score, Hardware Burden, Operations per network pass (see the BLEU scoring sketch below).
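BLEU on this benchmark is conventionally computed on the newstest2014 English-German test set. The following is a minimal sketch of scoring a system output with the sacrebleu Python library; the file paths are hypothetical, and note that many papers in this table instead report tokenized BLEU via multi-bleu.perl, so scores from the two conventions are not directly comparable.

```python
# Minimal sketch: corpus-level BLEU for a WMT14 En-De system output with sacrebleu.
# File names below are hypothetical placeholders, not part of this leaderboard.
import sacrebleu

# One translated sentence per line, aligned line-by-line with the reference file.
with open("newstest2014.hyp.de", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("newstest2014.ref.de", encoding="utf-8") as f:
    references = [line.strip() for line in f]

# corpus_bleu takes a list of hypothesis strings and a list of reference
# streams (here, a single reference per sentence).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```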
Evaluation Results
Performance of each model on this benchmark:
| Model | BLEU score | Hardware Burden | Operations per network pass | Paper Title | Repository |
|---|---|---|---|---|---|
| Transformer Big + adversarial MLE | 29.52 | - | - | Improving Neural Language Modeling via Adversarial Training | - |
| MAT | - | - | - | Multi-branch Attentive Transformer | - |
| AdvAug (aut+adv) | 29.57 | - | - | AdvAug: Robust Adversarial Augmentation for Neural Machine Translation | - |
| CMLM+LAT+4 iterations | 27.35 | - | - | Incorporating a Local Translation Mechanism into Non-autoregressive Translation | - |
| FlowSeq-large (IWD n = 15) | 22.94 | - | - | FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | - |
| Transformer (ADMIN init) | 30.1 | - | - | Very Deep Transformers for Neural Machine Translation | - |
| MUSE (Parallel Multi-scale Attention) | 29.9 | - | - | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning | - |
| Transformer-DRILL Base | 28.1 | - | - | Deep Residual Output Layers for Neural Language Generation | - |
| Transformer Big with FRAGE | 29.11 | - | - | FRAGE: Frequency-Agnostic Word Representation | - |
| GLAT | 25.21 | - | - | Glancing Transformer for Non-Autoregressive Neural Machine Translation | - |
| PartialFormer | 29.56 | - | - | PartialFormer: Modeling Part Instead of Whole for Machine Translation | - |
| Bi-SimCut | 30.78 | - | - | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | - |
| Transformer + SRU | 28.4 | 34G | - | Simple Recurrent Units for Highly Parallelizable Recurrence | - |
| PBMT | 20.7 | - | - | - | - |
| Local Joint Self-attention | 29.7 | - | - | Joint Source-Target Self Attention with Locality Constraints | - |
| Lite Transformer | 26.5 | - | - | Lite Transformer with Long-Short Range Attention | - |
| Average Attention Network (w/o FFN) | 26.05 | - | - | Accelerating Neural Transformer via an Average Attention Network | - |
| Unsupervised NMT + Transformer | 17.16 | - | - | Phrase-Based & Neural Unsupervised Machine Translation | - |
| KERMIT | 28.7 | - | - | KERMIT: Generative Insertion-Based Modeling for Sequences | - |
| T2R + Pretrain | 28.7 | - | - | Finetuning Pretrained Transformers into RNNs | - |
20 of 91 leaderboard entries shown.