HyperAI

Language Modelling on Penn Treebank (Word Level)

Metrics

Params
Test perplexity
Validation perplexity
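
For readers unfamiliar with the metric: perplexity is conventionally the exponential of the average negative log-likelihood per token, so lower is better. The sketch below illustrates that standard definition only. The function name `perplexity` and the sample probabilities are illustrative assumptions, not taken from any paper on this leaderboard, and the code assumes natural-log probabilities.

```python
import math


def perplexity(token_log_probs):
    """Return perplexity given the natural-log probability a model
    assigned to each target token of an evaluation set."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    # Perplexity is the exponential of that average.
    return math.exp(avg_nll)


# Hypothetical input: a model that assigns probability 1/50 to every token
# behaves like a uniform guess over 50 words, so its perplexity is about 50.
uniform_guess = [math.log(1 / 50)] * 1000
print(round(perplexity(uniform_guess), 2))
```

Read this way, a test perplexity of about 50 means the model is, on average, as uncertain as a uniform choice among roughly 50 words at each position.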

Results

Performance results of various models on this benchmark

| Model Name | Params | Test perplexity | Validation perplexity | Paper Title |
|---|---|---|---|---|
| GL-LWGC + AWD-MoS-LSTM + dynamic eval | 26M | 46.34 | 46.64 | Gradual Learning of Recurrent Neural Networks |
| Inan et al. (2016) - Variational RHN | - | 66.0 | 68.1 | Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling |
| AWD-LSTM-MoS + Partial Shuffle | 22M | 53.92 | 55.89 | Partially Shuffling the Training Data to Improve Language Models |
| AWD-LSTM-DOC + Partial Shuffle | 23M | 52.0 | 53.79 | Partially Shuffling the Training Data to Improve Language Models |
| Gal & Ghahramani (2016) - Variational LSTM (medium) | - | 79.7 | 81.9 | A Theoretically Grounded Application of Dropout in Recurrent Neural Networks |
| adversarial + AWD-LSTM-MoS + dynamic eval | 22M | 46.01 | 46.63 | Improving Neural Language Modeling via Adversarial Training |
| AWD-LSTM-DOC x5 | 185M | 47.17 | 48.63 | Direct Output Connection for a High-Rank Language Model |
| 2-layer skip-LSTM + dropout tuning | 24M | 55.3 | 57.1 | Pushing the bounds of dropout |
| NAS-RL | 25M | 64.0 | - | Neural Architecture Search with Reinforcement Learning |
| Trellis Network | - | 54.19 | - | Trellis Networks for Sequence Modeling |
| Mogrifier LSTM + dynamic eval | 24M | 44.9 | 44.8 | Mogrifier LSTM |
| DEQ-TrellisNet | 24M | 57.1 | - | Deep Equilibrium Models |
| LSTM (Bai et al., 2018) | - | 78.93 | - | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| TCN | 14.7M | 108.47 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| AWD-LSTM + dynamic eval | 24M | 51.1 | 51.6 | Dynamic Evaluation of Neural Sequence Models |
| Recurrent highway networks | 23M | 65.4 | 67.9 | Recurrent Highway Networks |
| FRAGE + AWD-LSTM-MoS + dynamic eval | 22M | 46.54 | 47.38 | FRAGE: Frequency-Agnostic Word Representation |
| AWD-LSTM + continuous cache pointer | 24M | 52.8 | 53.9 | Regularizing and Optimizing LSTM Language Models |
| Past Decode Reg. + AWD-LSTM-MoS + dyn. eval. | 22M | 47.3 | 48.0 | Improved Language Modeling by Decoding the Past |
| Seq-U-Net | 14.9M | 107.95 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |