Machine Translation on WMT2014 English-German
Evaluation Metrics
- BLEU score
- Hardware Burden
- Operations per network pass
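BLEU on this benchmark is a corpus-level score over the WMT 2014 newstest English→German test set, reported on a 0–100 scale. As a minimal sketch of how such a score is typically computed, assuming the sacrebleu package and hypothetical hypothesis/reference file names (individual papers on this leaderboard may differ in tokenization and scoring details):

```python
# Minimal sketch: corpus-level BLEU with sacrebleu (a standard WMT-style
# evaluation tool). File names are hypothetical; the leaderboard itself
# does not specify the exact tooling each paper used.
import sacrebleu

# One system output (hypothesis) and one reference per line, aligned by index.
with open("newstest2014.en-de.hyp", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("newstest2014.en-de.ref", encoding="utf-8") as f:
    references = [line.strip() for line in f]

# corpus_bleu takes a list of hypotheses and a list of reference streams
# (one inner list per reference set) and returns a BLEUScore object.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")  # same 0-100 scale as the table below
```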
Evaluation Results
Performance results of each model on this benchmark (a subset of the 91 leaderboard entries is shown below).
| Model | BLEU score | Hardware Burden | Operations per network pass | Paper |
|---|---|---|---|---|
| Transformer Big + adversarial MLE | 29.52 | - | - | Improving Neural Language Modeling via Adversarial Training |
| MAT | - | - | - | Multi-branch Attentive Transformer |
| AdvAug (aut+adv) | 29.57 | - | - | AdvAug: Robust Adversarial Augmentation for Neural Machine Translation |
| CMLM+LAT+4 iterations | 27.35 | - | - | Incorporating a Local Translation Mechanism into Non-autoregressive Translation |
| FlowSeq-large (IWD n = 15) | 22.94 | - | - | FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow |
| Transformer (ADMIN init) | 30.1 | - | - | Very Deep Transformers for Neural Machine Translation |
| MUSE (Parallel Multi-scale Attention) | 29.9 | - | - | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning |
| Transformer-DRILL Base | 28.1 | - | - | Deep Residual Output Layers for Neural Language Generation |
| Transformer Big with FRAGE | 29.11 | - | - | FRAGE: Frequency-Agnostic Word Representation |
| GLAT | 25.21 | - | - | Glancing Transformer for Non-Autoregressive Neural Machine Translation |
| PartialFormer | 29.56 | - | - | PartialFormer: Modeling Part Instead of Whole for Machine Translation |
| Bi-SimCut | 30.78 | - | - | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation |
| Transformer + SRU | 28.4 | 34G | - | Simple Recurrent Units for Highly Parallelizable Recurrence |
| PBMT | 20.7 | - | - | - |
| Local Joint Self-attention | 29.7 | - | - | Joint Source-Target Self Attention with Locality Constraints |
| Lite Transformer | 26.5 | - | - | Lite Transformer with Long-Short Range Attention |
| Average Attention Network (w/o FFN) | 26.05 | - | - | Accelerating Neural Transformer via an Average Attention Network |
| Unsupervised NMT + Transformer | 17.16 | - | - | Phrase-Based & Neural Unsupervised Machine Translation |
| KERMIT | 28.7 | - | - | KERMIT: Generative Insertion-Based Modeling for Sequences |
| T2R + Pretrain | 28.7 | - | - | Finetuning Pretrained Transformers into RNNs |