
Language Modelling on enwik8

Metrics

Bit per Character (BPC)
Number of params
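
Bit per Character (BPC) is the model's average per-character cross-entropy expressed in base 2, so lower is better: a model at 1.0 BPC effectively compresses the text to one bit per character, versus 8 bits for a raw byte encoding. As a minimal sketch (the function below is illustrative, not part of any HyperAI tooling), converting a loss reported in nats to BPC is just a division by ln 2:

```python
import math

def bits_per_character(nll_nats: float, num_chars: int) -> float:
    """Convert a total negative log-likelihood measured in nats
    (natural-log cross-entropy) into bits per character (BPC)."""
    return nll_nats / (num_chars * math.log(2))

# A cross-entropy of ~0.645 nats per character corresponds to ~0.93 BPC,
# the best figure in the comparison table below.
print(bits_per_character(nll_nats=0.645 * 100, num_chars=100))  # ~0.93
```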

Results

Performance results of various models on this benchmark

Comparison Table
Model Name | Bit per Character (BPC) | Number of params
character-level-language-modeling-with-deeper | 1.11 | 44M
transformer-xl-attentive-language-models | 1.06 | 41M
longformer-the-long-document-transformer | 1.00 | 41M
single-headed-attention-rnn-stop-thinking | 1.33 | 51M
long-short-transformer-efficient-transformers | 0.97 | 110M
multiplicative-lstm-for-sequence-modelling | 1.24 | 46M
hypernetworks | 1.34 | 27M
accessing-higher-level-representations-in | 0.96 | 77M
adaptive-attention-span-in-transformers | 1.02 | 39M
language-models-are-unsupervised-multitask | 0.93 | 1542M
transformer-xl-attentive-language-models | 0.99 | 277M
cluster-former-clustering-based-sparse | 1.22 | -
transformer-xl-attentive-language-models | 1.03 | 88M
compressive-transformers-for-long-range-1 | 0.97 | 277M
generating-sequences-with-recurrent-neural | 1.67 | -
adaptive-attention-span-in-transformers | 0.98 | 209M
2305-14952 | 0.940 | 22M
dynamic-evaluation-of-transformer-language | 0.940 | 277M
hierarchical-multiscale-recurrent-neural | 1.32 | 35M
not-all-memories-are-created-equal-learning-1 | 0.95 | 208M
memory-efficient-stochastic-methods-for | 1.033 | 41M
when-attention-meets-fast-recurrence-training | 0.97 | 108M
efficient-content-based-sparse-attention-with-1 | 0.99 | -
mogrifier-lstm | 1.195 | 48M
improving-transformer-models-by-reordering | 0.968 | 209M
the-information-pathways-hypothesis | 1.024 | -
190410509 | 0.99 | 95M
augmenting-self-attention-with-persistent | - | 114M
single-headed-attention-rnn-stop-thinking | 1.076 | 52M
recurrent-highway-networks | 1.27 | 46M
long-short-transformer-efficient-transformers | 0.99 | -
augmenting-self-attention-with-persistent | 1.01 | 39M
single-headed-attention-rnn-stop-thinking | 1.068 | 54M
mogrifier-lstm | 1.146 | 48M
bp-transformer-modelling-long-range-context | 1.02 | 38M
hierarchical-transformers-are-more-efficient | 0.997 | -
longformer-the-long-document-transformer | 0.99 | 102M
character-level-language-modeling-with-deeper | 1.06 | 235M
fast-slow-recurrent-neural-networks | 1.25 | 47M
neural-machine-translation-in-linear-time | 1.31 | -
when-attention-meets-fast-recurrence-training | 0.95 | 195M
an-analysis-of-neural-language-modeling-at | 1.232 | 47M