HyperAI

Language Modelling On Penn Treebank Character

Metrics

Bit per Character (BPC)
Number of params
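Bit per Character is the average number of bits a model needs to encode each character of the test text, i.e. the mean negative log2-probability it assigns to the true next character (lower is better). As a minimal illustrative sketch (not from the source; the probability list is a hypothetical example):

```python
import math

def bits_per_character(char_probs):
    """Average negative log2-probability the model assigns to each
    character in the evaluation text (lower is better)."""
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)

# A model that assigns probability 0.5 to every character scores exactly 1 BPC.
print(bits_per_character([0.5, 0.5, 0.5, 0.5]))  # → 1.0
```

Equivalently, BPC is the per-character cross-entropy in nats divided by ln 2.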

Results

Performance results of various models on this benchmark

| Model Name | Bit per Character (BPC) | Number of params | Paper Title |
| --- | --- | --- | --- |
| TCN | 1.31 | 5.9M | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| Past Decode Reg. + AWD-LSTM-MoS + dyn. eval. | 1.169 | 13.8M | Improved Language Modeling by Decoding the Past |
| 2-layer Norm HyperLSTM | 1.219 | 14.4M | HyperNetworks |
| Feedback Transformer | 1.160 | 10.7M | Addressing Some Limitations of Transformers with Feedback Memory |
| Mogrifier LSTM + dynamic eval | 1.083 | 24M | Mogrifier LSTM |
| GAM-RHN-5 | 1.147 | 16.0M | Recurrent Highway Networks with Grouped Auxiliary Memory |
| Mogrifier LSTM | 1.120 | 24M | Mogrifier LSTM |
| Seq-U-Net | 1.3 | 5.9M | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| Trellis Network | 1.158 | 13.4M | Trellis Networks for Sequence Modeling |
| R-Transformer | 1.24 | - | R-Transformer: Recurrent Neural Network Enhanced Transformer |
| 6-layer QRNN | 1.187 | 13.8M | An Analysis of Neural Language Modeling at Multiple Scales |
| IndRNN | 1.19 | - | Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN |
| Dense IndRNN | 1.18 | - | Deep Independently Recurrent Neural Network (IndRNN) |
| Temporal Convolutional Network | 1.31 | - | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| NAS-RL | 1.214 | 16.3M | Neural Architecture Search with Reinforcement Learning |
| FS-LSTM-4 | 1.190 | 27M | Fast-Slow Recurrent Neural Networks |
| Bipartite Flow | 1.38 | - | Discrete Flows: Invertible Generative Models of Discrete Data |
| STAR | 1.30 | - | Gating Revisited: Deep Multi-layer RNNs That Can Be Trained |
| 3-layer AWD-LSTM | 1.175 | 13.8M | An Analysis of Neural Language Modeling at Multiple Scales |
| FS-LSTM-2 | 1.193 | 27M | Fast-Slow Recurrent Neural Networks |