Language Modelling On Hutter Prize
Metrics
Bits per Character (BPC)
Number of parameters
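For orientation, BPC is conventionally the average negative base-2 log-probability that a model assigns to each true character of the test text, so lower is better. The following is a minimal sketch under that assumption; the function names `bits_per_character` and `nats_to_bpc` are hypothetical and are not taken from any of the listed papers or from this benchmark's tooling.

```python
import math


def bits_per_character(char_probs):
    """Average negative log2-probability over the true characters.

    char_probs: probabilities (each in (0, 1]) that the model assigned
    to the character that actually occurred at each position.
    """
    if not char_probs:
        raise ValueError("need at least one probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


def nats_to_bpc(mean_loss_nats):
    """Convert a mean per-character cross-entropy in nats to bits."""
    return mean_loss_nats / math.log(2)


if __name__ == "__main__":
    # A model that gives every true character probability 0.5
    # scores exactly 1 bit per character.
    print(bits_per_character([0.5, 0.5, 0.5]))
    # A training loss reported in nats can be rescaled the same way.
    print(nats_to_bpc(0.69))
```

The second helper is useful because many training frameworks report cross-entropy with the natural logarithm, whereas the table reports bits.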
Results
Performance results of various models on this benchmark
Comparison table
Model name | Bits per Character (BPC) | Number of parameters |
---|---|---|
recurrent-highway-networks | 1.27 | 46M |
fast-slow-recurrent-neural-networks | 1.245 | 47M |
dynamic-evaluation-of-transformer-language | 0.94 | 277M |
transformer-xl-attentive-language-models | 1.03 | 88M |
multiplicative-lstm-for-sequence-modelling | 1.24 | 46M |
mogrifier-lstm | 1.122 | 96M |
transformer-xl-attentive-language-models | 1.06 | 41M |
character-level-language-modeling-with-deeper | 1.06 | 235M |
an-analysis-of-neural-language-modeling-at | 1.232 | 47M |
longformer-the-long-document-transformer | 1.00 | 41M |
character-level-language-modeling-with-deeper | 1.11 | 44M |
fast-slow-recurrent-neural-networks | 1.277 | 27M |
dynamic-evaluation-of-neural-sequence-models | 1.08 | 46M |
longformer-the-long-document-transformer | 0.99 | 102M |
recurrent-highway-networks | 1.31 | - |
transformer-xl-attentive-language-models | 0.99 | 277M |
compressive-transformers-for-long-range-1 | 0.97 | - |
mogrifier-lstm | 0.988 | 96M |