Language Modelling On One Billion Word
Metrics
Number of params
PPL
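PPL conventionally denotes perplexity, where lower values are better. As a minimal sketch of how the metric is usually computed, assuming per-token natural-log probabilities are available from a model, perplexity is the exponential of the average negative log-likelihood. The function name `perplexity` and the example values are hypothetical and are not taken from any of the listed papers.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood.

    token_log_probs: natural-log probabilities the model assigned
    to each token of the evaluation text (hypothetical input).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Illustrative values only: a model that assigns probability 0.25
# to every token has a perplexity of about 4.
example_log_probs = [math.log(0.25)] * 4
print(perplexity(example_log_probs))
```

Under this definition, a perplexity of about 4 means the model is, on average, as uncertain as if it were choosing uniformly among four tokens.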
Results
Performance results of various models on this benchmark
| Model name | Number of params | PPL | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| OmniNetT (Large) | 100M | 21.5 | OmniNet: Omnidirectional Representations from Transformers | - |
| LSTM-8192-1024 + CNN Input | 1.04B | 30.0 | Exploring the Limits of Language Modeling | - |
| Cohere Large | - | 25.06 | - | - |
| Transformer-XL Large | 0.8B | 21.8 | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | - |
| Transformer-XL Base | 0.46B | 23.5 | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | - |
| Adaptive Input Large | 0.46B | 23.91 | Adaptive Input Representations for Neural Language Modeling | - |
| DynamicConv | 0.34B | 26.67 | Pay Less Attention with Lightweight and Dynamic Convolutions | - |
| Adaptive Input Very Large | 1.0B | 23.02 | Adaptive Input Representations for Neural Language Modeling | - |
| RNN-1024 + 9 Gram | 20B | 51.3 | One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling | - |
| OmniNetB (Large) | - | 22 | OmniNet: Omnidirectional Representations from Transformers | - |
| GPT-2 | 1.54B | 42.16 | Language Models are Unsupervised Multitask Learners | - |
| Evolved Transformer Big | - | 28.6 | The Evolved Transformer | - |
| OmniNetP (Large) | 100M | 21.6 | OmniNet: Omnidirectional Representations from Transformers | - |
| Mesh Tensorflow | 4.9B | 24.0 | Mesh-TensorFlow: Deep Learning for Supercomputers | - |
| SRU++ Large | 465M | 23.5 | When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute | - |
| Low-Budget MoE | 5B | 34.1 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | - |
| BIG G-LSTM-2 | - | 36.0 | Factorization tricks for LSTM networks | - |
| LSTM-8192-1024 | 1.8B | 30.6 | Exploring the Limits of Language Modeling | - |
| GCNN-14 bottleneck | - | 31.9 | Language Modeling with Gated Convolutional Networks | - |
| Sparse Non-Negative | 33B | 52.9 | Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation | - |