Language Modelling
Benchmark List
All benchmarks related to this task
100-sleep-nights-of-8-caregivers
Best model: GPT-3
2000-hub5-english
Best model: MMLU
big-bench-lite-1
Best model: GLM-130B (3-shot)
c4
Best model: Primer
clue-cmrc2018
Best model: GLM-130B
clue-ocnli-50k
Best model: GLM-130B
enwik8-dev
Best model: Transformer-LS (small)
enwik8
Best model: GPT-2 (48 layers, h=1600)
enwiki8
Best model: PAR Transformer 24B
hutter-prize
Best model: Transformer-XL + RMS dynamic eval
lambada
Best model: GPT-3 175B (Few-Shot)
language-modeling-recommendation
Best model: GPT2
one-billion-word
Best model: MDLM (AR baseline)
openwebtext
Best model: GPT2-Hermite
penn-treebank-character-level
Best model: Mogrifier LSTM + dynamic eval
penn-treebank-word-level
Best model: GPT-3 (Zero-Shot)
ptb
Best model: I-DARTS
salmon
Best model: Spirit-LM (Expr.)
stackexchange
Best model: Gopher
text8
Best model: GPT-2
text8-dev
Best model: Transformer-LS (small)
the-pile
Best model: Test-Time Fine-Tuning with SIFT + Llama-3.2 (3B)
vietmed
Best model: Hybrid 4-gram VietMed-Train + ExtraText
wiki-40b
Best model: FLASH-Quad-8k
wikitext-103
Best model: RETRO (7.5B)
wikitext-2
Best model: SparseGPT (175B, 50% Sparsity)
-5
arxiv
bookcorpus2
books3
clue-afqmc
clue-c3
clue-cmnli
clue-drcd
clue-wsc1-1
curation-corpus
dm-mathematics
fewclue-bustm
fewclue-chid-fc
fewclue-cluewsc-fc
fewclue-eprstmt
fewclue-ocnli-fc
freelaw
github
gutenberg-pg-19
hackernews
nih-exporter
opensubtitles-1
openwebtext2
philpapers
pile-cc
pubmed-abstracts
pubmed-central
ubuntu-irc
uspto-backgrounds