# Language Modelling On Wikitext 103
## Metrics

- Number of params
- Test perplexity
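
For reference, test perplexity is the exponentiated mean negative log-likelihood the model assigns to the test tokens (lower is better). For a tokenized test corpus $x_1, \dots, x_N$:

$$
\mathrm{PPL} = \exp\!\Big(-\frac{1}{N}\sum_{i=1}^{N} \log p_\theta(x_i \mid x_{<i})\Big)
$$

Note that the tokenization it is measured over varies by paper (word-level for most entries below, BPE for GPT-2 Small), so values are only strictly comparable under the same vocabulary.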
## Results

Performance results of various models on this benchmark (20 of 89 entries shown).

| Model name | Number of params | Test perplexity | Paper Title |
| --- | --- | --- | --- |
| LSTM | - | 48.7 | Improving Neural Language Models with a Continuous Cache |
| Temporal CNN | - | 45.2 | Convolutional Sequence Modeling Revisited |
| TCN | - | 45.19 | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| GCNN-8 | - | 44.9 | Language Modeling with Gated Convolutional Networks |
| Neural cache model (size = 100) | - | 44.8 | Improving Neural Language Models with a Continuous Cache |
| Neural cache model (size = 2,000) | - | 40.8 | Improving Neural Language Models with a Continuous Cache |
| GPT-2 Small | 124M | 37.50 | Language Models are Unsupervised Multitask Learners |
| GCNN-8 | - | 37.2 | Language Modeling with Gated Convolutional Networks |
| LSTM | - | 36.4 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian) | - | 34.3 | Fast Parametric Learning with Activation Memorization |
| 4 layer QRNN | 151M | 33.0 | An Analysis of Neural Language Modeling at Multiple Scales |
| AWD-LSTM-MoS + ATOI | - | 32.85 | Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes |
| DEQ-Transformer (small) | 138M | 32.4 | Deep Equilibrium Models |
| LSTM (RMC) | - | 31.6 | Relational recurrent neural networks |
| Primal.+Trans. | - | 31.0 | Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation |
| Rfa-Gate-Gaussian-Stateful (Small) | - | 30.5 | Random Feature Attention |
| LSTM (Hebbian, Cache) | - | 29.7 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian, Cache, MbPA) | - | 29.2 | Fast Parametric Learning with Activation Memorization |
| Trellis Network | - | 29.19 | Trellis Networks for Sequence Modeling |
| DEQ-TrellisNet | 180M | 29.0 | Deep Equilibrium Models |