HyperAI

Language Modelling on enwik8

Evaluation Metrics

Bit per Character (BPC)
Number of params
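As background on the first metric: Bit per Character (BPC) is the model's average negative log-likelihood per character expressed in base 2, so a cross-entropy loss reported in nats per character converts by dividing by ln 2. A minimal sketch of that conversion (the function name `bits_per_character` is ours, not part of this leaderboard):

```python
import math

def bits_per_character(nll_nats_per_char: float) -> float:
    """Convert an average negative log-likelihood in nats/char to BPC.

    BPC = NLL / ln(2), i.e. the same quantity re-expressed in base-2 bits.
    """
    return nll_nats_per_char / math.log(2)

# A loss of ln(2) nats/char corresponds to exactly 1.0 BPC.
```

Lower BPC is better: a model at 0.93 BPC needs fewer bits on average to encode each character of the test set than one at 1.33 BPC.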

Evaluation Results

Performance of each model on this benchmark

Comparison Table
| Model | Bit per Character (BPC) | Number of params |
| --- | --- | --- |
| character-level-language-modeling-with-deeper | 1.11 | 44M |
| transformer-xl-attentive-language-models | 1.06 | 41M |
| longformer-the-long-document-transformer | 1.00 | 41M |
| single-headed-attention-rnn-stop-thinking | 1.33 | 51M |
| long-short-transformer-efficient-transformers | 0.97 | 110M |
| multiplicative-lstm-for-sequence-modelling | 1.24 | 46M |
| hypernetworks | 1.34 | 27M |
| accessing-higher-level-representations-in | 0.96 | 77M |
| adaptive-attention-span-in-transformers | 1.02 | 39M |
| language-models-are-unsupervised-multitask | 0.93 | 1542M |
| transformer-xl-attentive-language-models | 0.99 | 277M |
| cluster-former-clustering-based-sparse | 1.22 | - |
| transformer-xl-attentive-language-models | 1.03 | 88M |
| compressive-transformers-for-long-range-1 | 0.97 | 277M |
| generating-sequences-with-recurrent-neural | 1.67 | - |
| adaptive-attention-span-in-transformers | 0.98 | 209M |
| 2305-14952 | 0.940 | 22M |
| dynamic-evaluation-of-transformer-language | 0.940 | 277M |
| hierarchical-multiscale-recurrent-neural | 1.32 | 35M |
| not-all-memories-are-created-equal-learning-1 | 0.95 | 208M |
| memory-efficient-stochastic-methods-for | 1.033 | 41M |
| when-attention-meets-fast-recurrence-training | 0.97 | 108M |
| efficient-content-based-sparse-attention-with-1 | 0.99 | - |
| mogrifier-lstm | 1.195 | 48M |
| improving-transformer-models-by-reordering | 0.968 | 209M |
| the-information-pathways-hypothesis | 1.024 | - |
| 1904-10509 | 0.99 | 95M |
| augmenting-self-attention-with-persistent | - | 114M |
| single-headed-attention-rnn-stop-thinking | 1.076 | 52M |
| recurrent-highway-networks | 1.27 | 46M |
| long-short-transformer-efficient-transformers | 0.99 | - |
| augmenting-self-attention-with-persistent | 1.01 | 39M |
| single-headed-attention-rnn-stop-thinking | 1.068 | 54M |
| mogrifier-lstm | 1.146 | 48M |
| bp-transformer-modelling-long-range-context | 1.02 | 38M |
| hierarchical-transformers-are-more-efficient | 0.997 | - |
| longformer-the-long-document-transformer | 0.99 | 102M |
| character-level-language-modeling-with-deeper | 1.06 | 235M |
| fast-slow-recurrent-neural-networks | 1.25 | 47M |
| neural-machine-translation-in-linear-time | 1.31 | - |
| when-attention-meets-fast-recurrence-training | 0.95 | 195M |
| an-analysis-of-neural-language-modeling-at | 1.232 | 47M |