HyperAI

Language Modelling On Hutter Prize

Metrics

Bit per Character (BPC)
Number of params
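Bit per Character (BPC) is the mean negative log-likelihood the model assigns to each character, expressed in base-2. As a minimal sketch (the helper name and the example loss value are illustrative, not from this leaderboard), a cross-entropy loss reported in nats converts to BPC by a change of logarithm base:

```python
import math

def bpc_from_nats(mean_nll_nats: float) -> float:
    """Convert a mean per-character negative log-likelihood (in nats)
    to bits per character (BPC) by changing the log base from e to 2."""
    return mean_nll_nats / math.log(2)

# Example: a hypothetical model with a mean NLL of 0.72 nats/char
# scores roughly 1.04 BPC -- in the range of the results below.
print(bpc_from_nats(0.72))
```

Lower is better: a model at 1.0 BPC compresses the test text to one bit per character on average.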

Results

Performance results of various models on this benchmark

| Model name | Bit per Character (BPC) | Number of params | Paper Title |
| --- | --- | --- | --- |
| Large RHN | 1.27 | 46M | Recurrent Highway Networks |
| Large FS-LSTM-4 | 1.245 | 47M | Fast-Slow Recurrent Neural Networks |
| Transformer-XL + RMS dynamic eval | 0.94 | 277M | Dynamic Evaluation of Transformer Language Models |
| 18-layer Transformer-XL | 1.03 | 88M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Large mLSTM +emb +WN +VD | 1.24 | 46M | Multiplicative LSTM for sequence modelling |
| Mogrifier LSTM | 1.122 | 96M | Mogrifier LSTM |
| 12-layer Transformer-XL | 1.06 | 41M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| 64-layer Character Transformer Model | 1.06 | 235M | Character-Level Language Modeling with Deeper Self-Attention |
| 3-layer AWD-LSTM | 1.232 | 47M | An Analysis of Neural Language Modeling at Multiple Scales |
| Longformer Small | 1.00 | 41M | Longformer: The Long-Document Transformer |
| 12-layer Character Transformer Model | 1.11 | 44M | Character-Level Language Modeling with Deeper Self-Attention |
| FS-LSTM-4 | 1.277 | 27M | Fast-Slow Recurrent Neural Networks |
| mLSTM + dynamic eval | 1.08 | 46M | Dynamic Evaluation of Neural Sequence Models |
| Longformer Large | 0.99 | 102M | Longformer: The Long-Document Transformer |
| RHN - depth 5 [zilly2016recurrent] | 1.31 | - | Recurrent Highway Networks |
| 24-layer Transformer-XL | 0.99 | 277M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Compressive Transformer | 0.97 | - | Compressive Transformers for Long-Range Sequence Modelling |
| Mogrifier LSTM + dynamic eval | 0.988 | 96M | Mogrifier LSTM |