
Language Modelling on Penn Treebank (Word Level)

Evaluation Metrics

Params
Test perplexity
Validation perplexity
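
Both perplexity metrics are computed the same way: perplexity is the exponential of the average per-token negative log-likelihood the model assigns to the held-out text, so lower values are better. A minimal sketch of that computation, assuming per-token log-probabilities are available (the function and the example numbers below are illustrative, not taken from any listed paper):

```python
import math

def perplexity(token_log_probs):
    """Word-level perplexity: exp of the average negative log-likelihood.

    token_log_probs: natural-log probabilities the model assigned to each
    token of the held-out text. Lower perplexity is better.
    """
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# Illustrative usage: three tokens assigned probabilities 0.1, 0.2, 0.05.
# Their geometric mean is 0.1, so the perplexity is 1 / 0.1 = 10.
print(perplexity([math.log(p) for p in (0.1, 0.2, 0.05)]))  # ~10.0
```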

Evaluation Results

Performance of each model on this benchmark

| Model | Params | Test perplexity | Validation perplexity | Paper |
| --- | --- | --- | --- | --- |
| GL-LWGC + AWD-MoS-LSTM + dynamic eval | 26M | 46.34 | 46.64 | Gradual Learning of Recurrent Neural Networks |
| Inan et al. (2016) - Variational RHN | - | 66.0 | 68.1 | Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling |
| AWD-LSTM-MoS + Partial Shuffle | 22M | 53.92 | 55.89 | Partially Shuffling the Training Data to Improve Language Models |
| AWD-LSTM-DOC + Partial Shuffle | 23M | 52.0 | 53.79 | Partially Shuffling the Training Data to Improve Language Models |
| Gal & Ghahramani (2016) - Variational LSTM (medium) | - | 79.7 | 81.9 | A Theoretically Grounded Application of Dropout in Recurrent Neural Networks |
| adversarial + AWD-LSTM-MoS + dynamic eval | 22M | 46.01 | 46.63 | Improving Neural Language Modeling via Adversarial Training |
| AWD-LSTM-DOC x5 | 185M | 47.17 | 48.63 | Direct Output Connection for a High-Rank Language Model |
| 2-layer skip-LSTM + dropout tuning | 24M | 55.3 | 57.1 | Pushing the bounds of dropout |
| NAS-RL | 25M | 64.0 | - | Neural Architecture Search with Reinforcement Learning |
| Trellis Network | - | 54.19 | - | Trellis Networks for Sequence Modeling |
| Mogrifier LSTM + dynamic eval | 24M | 44.9 | 44.8 | Mogrifier LSTM |
| DEQ-TrellisNet | 24M | 57.1 | - | Deep Equilibrium Models |
| LSTM (Bai et al., 2018) | - | 78.93 | - | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| TCN | 14.7M | 108.47 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| AWD-LSTM + dynamic eval | 24M | 51.1 | 51.6 | Dynamic Evaluation of Neural Sequence Models |
| Recurrent highway networks | 23M | 65.4 | 67.9 | Recurrent Highway Networks |
| FRAGE + AWD-LSTM-MoS + dynamic eval | 22M | 46.54 | 47.38 | FRAGE: Frequency-Agnostic Word Representation |
| AWD-LSTM + continuous cache pointer | 24M | 52.8 | 53.9 | Regularizing and Optimizing LSTM Language Models |
| Past Decode Reg. + AWD-LSTM-MoS + dyn. eval. | 22M | 47.3 | 48.0 | Improved Language Modeling by Decoding the Past |
| Seq-U-Net | 14.9M | 107.95 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |