Language Modelling On Hutter Prize

Evaluation metrics

Bit per Character (BPC)
Number of params
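Bit per Character (BPC) is the model's average cross-entropy per character, expressed in base 2; lower is better. Since frameworks typically report the negative log-likelihood in nats, converting to BPC is just a change of logarithm base. The helper below is an illustrative sketch (the function name and arguments are not from any particular library):

```python
import math

def bits_per_character(total_nll_nats: float, num_characters: int) -> float:
    """Convert a total negative log-likelihood (in nats) over a test
    string into bits per character: NLL / (N * ln 2)."""
    return total_nll_nats / (num_characters * math.log(2))

# Sanity check: a uniform model over 256 byte values assigns
# log(256) nats per character, which is exactly 8 bits/char.
n_chars = 1000
total_nll = n_chars * math.log(256)
print(bits_per_character(total_nll, n_chars))  # → 8.0
```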

Results

Performance of the models on this benchmark.

| Model | Bit per Character (BPC) | Number of params | Paper | Repository |
|---|---|---|---|---|
| Large RHN | 1.27 | 46M | Recurrent Highway Networks | - |
| Large FS-LSTM-4 | 1.245 | 47M | Fast-Slow Recurrent Neural Networks | - |
| Transformer-XL + RMS dynamic eval | 0.94 | 277M | Dynamic Evaluation of Transformer Language Models | - |
| 18-layer Transformer-XL | 1.03 | 88M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | - |
| Large mLSTM +emb +WN +VD | 1.24 | 46M | Multiplicative LSTM for sequence modelling | - |
| Mogrifier LSTM | 1.122 | 96M | Mogrifier LSTM | - |
| 12-layer Transformer-XL | 1.06 | 41M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | - |
| 64-layer Character Transformer Model | 1.06 | 235M | Character-Level Language Modeling with Deeper Self-Attention | - |
| 3-layer AWD-LSTM | 1.232 | 47M | An Analysis of Neural Language Modeling at Multiple Scales | - |
| Longformer Small | 1.00 | 41M | Longformer: The Long-Document Transformer | - |
| 12-layer Character Transformer Model | 1.11 | 44M | Character-Level Language Modeling with Deeper Self-Attention | - |
| FS-LSTM-4 | 1.277 | 27M | Fast-Slow Recurrent Neural Networks | - |
| mLSTM + dynamic eval | 1.08 | 46M | Dynamic Evaluation of Neural Sequence Models | - |
| Longformer Large | 0.99 | 102M | Longformer: The Long-Document Transformer | - |
| RHN - depth 5 [zilly2016recurrent] | 1.31 | - | Recurrent Highway Networks | - |
| 24-layer Transformer-XL | 0.99 | 277M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | - |
| Compressive Transformer | 0.97 | - | Compressive Transformers for Long-Range Sequence Modelling | - |
| Mogrifier LSTM + dynamic eval | 0.988 | 96M | Mogrifier LSTM | - |