HyperAI초신경

Language Modelling On Hutter Prize

Evaluation Metrics

Bit per Character (BPC)
Number of params
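Bit per Character (BPC) is the average negative log2-likelihood the model assigns to each character of the test stream; lower is better. A minimal sketch of the computation, assuming per-character probabilities are already available (the function name is illustrative, not from any library):

```python
import math

def bits_per_character(char_probs):
    """BPC = -(1/N) * sum(log2 p(c_i)) over the N characters of the
    test stream, where p(c_i) is the probability the model assigned
    to the i-th character. Lower is better."""
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)

# A model that assigns probability 0.5 to every character scores exactly 1.0 BPC.
print(bits_per_character([0.5, 0.5, 0.5, 0.5]))  # → 1.0
```

A uniform model over a 256-symbol byte alphabet would score 8.0 BPC, so the sub-1.0 results below reflect substantial compression of the enwik8 text.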

Evaluation Results

Performance of each model on this benchmark

| Model | Bit per Character (BPC) | Number of params | Paper Title |
| --- | --- | --- | --- |
| Large RHN | 1.27 | 46M | Recurrent Highway Networks |
| Large FS-LSTM-4 | 1.245 | 47M | Fast-Slow Recurrent Neural Networks |
| Transformer-XL + RMS dynamic eval | 0.94 | 277M | Dynamic Evaluation of Transformer Language Models |
| 18-layer Transformer-XL | 1.03 | 88M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Large mLSTM +emb +WN +VD | 1.24 | 46M | Multiplicative LSTM for sequence modelling |
| Mogrifier LSTM | 1.122 | 96M | Mogrifier LSTM |
| 12-layer Transformer-XL | 1.06 | 41M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| 64-layer Character Transformer Model | 1.06 | 235M | Character-Level Language Modeling with Deeper Self-Attention |
| 3-layer AWD-LSTM | 1.232 | 47M | An Analysis of Neural Language Modeling at Multiple Scales |
| Longformer Small | 1.00 | 41M | Longformer: The Long-Document Transformer |
| 12-layer Character Transformer Model | 1.11 | 44M | Character-Level Language Modeling with Deeper Self-Attention |
| FS-LSTM-4 | 1.277 | 27M | Fast-Slow Recurrent Neural Networks |
| mLSTM + dynamic eval | 1.08 | 46M | Dynamic Evaluation of Neural Sequence Models |
| Longformer Large | 0.99 | 102M | Longformer: The Long-Document Transformer |
| RHN - depth 5 [zilly2016recurrent] | 1.31 | - | Recurrent Highway Networks |
| 24-layer Transformer-XL | 0.99 | 277M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Compressive Transformer | 0.97 | - | Compressive Transformers for Long-Range Sequence Modelling |
| Mogrifier LSTM + dynamic eval | 0.988 | 96M | Mogrifier LSTM |