HyperAI
HyperAI超神经
Machine Translation on WMT2014 English-German
Evaluation Metrics
BLEU score
Hardware Burden
Operations per network pass
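The ranking metric, BLEU, scores a candidate translation by its modified n-gram precision against a reference, scaled by a brevity penalty. The sketch below illustrates the idea for a single sentence pair; it is a simplification, not the exact tokenization or smoothing used by the evaluation tooling behind this leaderboard.

```python
import math
from collections import Counter


def bleu(candidate, reference, max_n=4):
    """Illustrative BLEU for one tokenized sentence pair (no smoothing)."""
    precisions = []
    for n in range(1, max_n + 1):
        # Clipped n-gram overlap between candidate and reference.
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        overlap = sum((cand & ref).values())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0  # any zero precision collapses the geometric mean
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 100; any missing 4-gram lowers the geometric mean, which is why the leaderboard gaps between 29 and 35 BLEU represent substantial quality differences.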
Evaluation Results

Performance of each model on this benchmark:

| Model | BLEU score | Hardware Burden | Operations per network pass | Paper Title | Repository |
|---|---|---|---|---|---|
| Transformer Cycle (Rev) | 35.14 | - | - | Lessons on Parameter Sharing across Layers in Transformers | - |
| Noisy back-translation | 35.0 | 146G | - | Understanding Back-Translation at Scale | - |
| Transformer+Rep(Uni) | 33.89 | - | - | Rethinking Perturbations in Encoder-Decoders for Fast Training | - |
| T5-11B | 32.1 | - | - | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | - |
| BiBERT | 31.26 | - | - | BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation | - |
| Transformer + R-Drop | 30.91 | 49G | - | R-Drop: Regularized Dropout for Neural Networks | - |
| Bi-SimCut | 30.78 | - | - | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | - |
| BERT-fused NMT | 30.75 | - | - | Incorporating BERT into Neural Machine Translation | - |
| Data Diversification - Transformer | 30.7 | - | - | Data Diversification: A Simple Strategy For Neural Machine Translation | - |
| SimCut | 30.56 | - | - | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | - |
| Mask Attention Network (big) | 30.4 | - | - | Mask Attention Networks: Rethinking and Strengthen Transformer | - |
| Transformer (ADMIN init) | 30.1 | - | - | Very Deep Transformers for Neural Machine Translation | - |
| PowerNorm (Transformer) | 30.1 | - | - | PowerNorm: Rethinking Batch Normalization in Transformers | - |
| Depth Growing | 30.07 | 24G | - | Depth Growing for Neural Machine Translation | - |
| MUSE (Parallel Multi-scale Attention) | 29.9 | - | - | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning | - |
| Evolved Transformer Big | 29.8 | - | - | The Evolved Transformer | - |
| OmniNetP | 29.8 | - | - | OmniNet: Omnidirectional Representations from Transformers | - |
| Local Joint Self-attention | 29.7 | - | - | Joint Source-Target Self Attention with Locality Constraints | - |
| DynamicConv | 29.7 | - | - | Pay Less Attention with Lightweight and Dynamic Convolutions | - |
| TaLK Convolutions | 29.6 | - | - | Time-aware Large Kernel Convolutions | - |
(Top 20 of 91 leaderboard entries shown.)