
Language Modelling On One Billion Word

Metrics

Number of params
PPL (perplexity; lower is better)
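
For reference, perplexity (PPL) is the exponential of the average per-token negative log-likelihood on the test set, so lower values indicate a better language model. Below is a minimal sketch of the computation; the `perplexity` function and the toy probabilities are illustrative only, not taken from any of the listed models.

```python
import math

def perplexity(token_log_probs):
    # PPL = exp(mean negative log-likelihood over all tokens)
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Toy example: a model assigns probabilities 0.25, 0.5, 0.1 to three tokens.
log_probs = [math.log(p) for p in (0.25, 0.5, 0.1)]
print(perplexity(log_probs))  # ~4.31; a lower PPL means the model is less "surprised"
```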

Results

Performance results of various models on this benchmark

Comparison table

| Model name | Number of params | PPL |
| --- | --- | --- |
| omninet-omnidirectional-representations-from | 100M | 21.5 |
| exploring-the-limits-of-language-modeling | 1.04B | 30.0 |
| Model 3 | - | 25.06 |
| transformer-xl-attentive-language-models | 0.8B | 21.8 |
| transformer-xl-attentive-language-models | 0.46B | 23.5 |
| adaptive-input-representations-for-neural | 0.46B | 23.91 |
| pay-less-attention-with-lightweight-and | 0.34B | 26.67 |
| adaptive-input-representations-for-neural | 1.0B | 23.02 |
| one-billion-word-benchmark-for-measuring | 20B | 51.3 |
| omninet-omnidirectional-representations-from | - | 22 |
| language-models-are-unsupervised-multitask | 1.54B | 42.16 |
| the-evolved-transformer | - | 28.6 |
| omninet-omnidirectional-representations-from | 100M | 21.6 |
| mesh-tensorflow-deep-learning-for | 4.9B | 24.0 |
| when-attention-meets-fast-recurrence-training | 465M | 23.5 |
| outrageously-large-neural-networks-the | 5B | 34.1 |
| factorization-tricks-for-lstm-networks | - | 36.0 |
| exploring-the-limits-of-language-modeling | 1.8B | 30.6 |
| language-modeling-with-gated-convolutional | - | 31.9 |
| skip-gram-language-modeling-using-sparse-non | 33B | 52.9 |
| outrageously-large-neural-networks-the | 5B | 28.0 |
| h-transformer-1d-fast-one-dimensional | 53M | - |
| when-attention-meets-fast-recurrence-training | 328M | 25.1 |
| simple-and-effective-masked-diffusion | 110M | 20.09 |
| simple-and-effective-masked-diffusion | 110M | 23.00 |
| h-transformer-1d-fast-one-dimensional | 144M | - |
| exploring-the-limits-of-language-modeling | 43B | 23.7 |