HyperAI

Language Modelling on WikiText-2

Metrics

Number of params
Test perplexity
Validation perplexity
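
Both perplexity columns follow the standard definition: the exponential of the average per-token negative log-likelihood, reported on the WikiText-2 validation and test splits respectively (lower is better). A minimal sketch of that relationship in Python; the helper name and the numbers in the usage example are illustrative, not taken from the leaderboard:

```python
import math

def perplexity(total_nll: float, num_tokens: int) -> float:
    """Perplexity = exp(average negative log-likelihood per token).
    `total_nll` is the summed NLL in nats over `num_tokens` tokens."""
    return math.exp(total_nll / num_tokens)

# Illustrative numbers only: an average loss of 3.0 nats per token
# corresponds to a perplexity of exp(3.0) ≈ 20.09.
print(perplexity(total_nll=3000.0, num_tokens=1000))
```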

Results

Performance results of various models on this benchmark

| Model name | Number of params | Test perplexity | Validation perplexity | Paper Title | Repository |
|---|---|---|---|---|---|
| adversarial + AWD-LSTM-MoS + dynamic eval | 35M | 38.65 | 40.27 | Improving Neural Language Modeling via Adversarial Training | - |
| AWD-LSTM-DOC | 37M | 58.03 | 60.29 | Direct Output Connection for a High-Rank Language Model | - |
| Mogrifier LSTM | 35M | 55.1 | 57.3 | Mogrifier LSTM | - |
| GPT-2 (fine-tuned) | 1542M | 15.17 | 15.69 | Hydra: A System for Large Multi-Model Deep Learning | - |
| AWD-LSTM + dynamic eval | 33M | 44.3 | 46.4 | Dynamic Evaluation of Neural Sequence Models | - |
| AWD-LSTM + ATOI | 33M | 64.73 | 67.47 | Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes | - |
| Grave et al. (2016) - LSTM | - | 99.3 | - | Improving Neural Language Models with a Continuous Cache | - |
| OPT-175B (50% Sparsity) | - | 234.77 | - | SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot | - |
| GPT-2 (medium) | 345M | 22.76 | - | Language Models are Unsupervised Multitask Learners | - |
| SparseGPT (175B, 50% Sparsity) | - | 8.21 | - | SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot | - |
| FRAGE + AWD-LSTM-MoS + dynamic eval | 35M | 39.14 | 40.85 | FRAGE: Frequency-Agnostic Word Representation | - |
| GL-LWGC + AWD-MoS-LSTM + dynamic eval | 38M | 40.46 | 42.19 | Gradual Learning of Recurrent Neural Networks | - |
| AWD-FWM Schlag et al. (2020) | 37M | 61.65 | 54.48 | Learning Associative Inference Using Fast Weight Memory | - |
| GPT-2 (large) | 762M | 19.93 | - | Language Models are Unsupervised Multitask Learners | - |
| AWD-LSTM-DRILL | 34M | 61.9 | 64.9 | Deep Residual Output Layers for Neural Language Generation | - |
| GPT-2 | 1542M | 18.34 | - | Language Models are Unsupervised Multitask Learners | - |
| AWD-LSTM + continuous cache pointer | 33M | 52.0 | 53.8 | Regularizing and Optimizing LSTM Language Models | - |
| Melis et al. (2017) - 1-layer LSTM (tied) | 24M | 65.9 | 69.3 | On the State of the Art of Evaluation in Neural Language Models | - |
| Inan et al. (2016) - Variational LSTM (tied) (h=650) | - | 87.7 | 92.3 | Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling | - |
| AWD-LSTM-MoS + dynamic eval | 35M | 40.68 | 42.41 | Breaking the Softmax Bottleneck: A High-Rank RNN Language Model | - |
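
For readers who want to reproduce a number of this kind, the sketch below measures GPT-2 perplexity on the WikiText-2 test split using the Hugging Face `transformers` and `datasets` libraries with a strided sliding window. This is an assumption-laden sketch, not the evaluation harness behind the table: the 1024-token context, 512-token stride, and the `wikitext-2-raw-v1` split are choices made here, and the result is a subword-level perplexity, so it will not exactly match the word-level GPT-2 rows above.

```python
import math

import torch
import torch.nn.functional as F
from datasets import load_dataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda" if torch.cuda.is_available() else "cpu"
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device).eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# Concatenate the raw test split into one token stream.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")
seq_len = encodings.input_ids.size(1)

max_length, stride = 1024, 512  # assumed context size and stride
nll_sum, n_tokens, prev_end = 0.0, 0, 0

for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end              # tokens not scored in an earlier window
    input_ids = encodings.input_ids[:, begin:end].to(device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100       # mask the overlapping context

    with torch.no_grad():
        logits = model(input_ids).logits

    # Logits at position i predict the token at position i + 1.
    shift_logits = logits[:, :-1, :]
    shift_labels = target_ids[:, 1:]
    nll_sum += F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
        reduction="sum",
    ).item()
    n_tokens += (shift_labels != -100).sum().item()

    prev_end = end
    if end == seq_len:
        break

# Subword-level perplexity over the raw test text; the table's word-level
# numbers use different tokenization and normalization.
print(f"perplexity: {math.exp(nll_sum / n_tokens):.2f}")
```

The strided window gives each scored token a long left context while counting it only once, which yields a tighter (lower) perplexity than scoring disjoint chunks.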