HyperAI


Language Modelling

Language modeling is the task of predicting the next word or character in a document. Trained language models can be applied to a range of natural language processing tasks, such as text generation, text classification, and question answering. Since the 2010s, neural language models have largely replaced N-gram models, and since the 2020s, large language models (LLMs) have become the dominant route to state-of-the-art performance. Models are evaluated with metrics such as cross-entropy and perplexity; common datasets include WikiText-103, One Billion Word, Text8, C4, and The Pile.
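As a rough illustration of how the two metrics relate, the sketch below computes cross-entropy as the average negative log-probability a model assigns to the observed tokens, and perplexity as its exponential. The function names and the example probabilities are assumptions made for illustration; they do not come from any benchmark or leaderboard entry listed here.

```python
import math


def cross_entropy(token_probs):
    """Average negative log-likelihood, in nats, of the observed tokens.

    token_probs: the probabilities a (hypothetical) model assigned to each
    token that actually occurred in the evaluation text.
    """
    if not token_probs:
        raise ValueError("token_probs must be non-empty")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is the exponential of the cross-entropy (in nats)."""
    return math.exp(cross_entropy(token_probs))


def bits_per_token(token_probs):
    """Cross-entropy converted from nats to bits."""
    return cross_entropy(token_probs) / math.log(2)


# Illustrative values only: a model that always spreads its probability
# uniformly over four candidates has a perplexity of about 4.
example_probs = [0.25, 0.25, 0.25, 0.25]
print(cross_entropy(example_probs))
print(perplexity(example_probs))
print(bits_per_token(example_probs))
```

Lower values are better for both metrics. Character-level benchmarks often report the bits form (bits per character), while word-level benchmarks usually report perplexity.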

Penn Treebank (Word Level)

GPT-3 (Zero-Shot)

GPT-2 (48 layers, h=1600)

Test-Time Fine-Tuning with SIFT + Llama-3.2 (3B)

SparseGPT (175B, 50% Sparsity)

GPT-3 175B (Few-Shot)

One Billion Word

OmniNetT (Large)

Penn Treebank (Character Level)

Mogrifier LSTM + dynamic eval

Transformer-XL + RMS dynamic eval

Spirit-LM (Expr.)

GLM-130B (3-shot)

FewCLUE (OCNLI-FC)

FewCLUE (EPRSTMT)

CLUE (CMRC2018)

CLUE (OCNLI_50K)

FewCLUE (CHID-FC)

Hybrid 4-gram VietMed-Train + ExtraText

FewCLUE (BUSTM)

FewCLUE (CLUEWSC-FC)

Transformer-LS (small)

Curation Corpus

USPTO Backgrounds

Ethereum Phishing Transaction Network

PTB Diagnostic ECG Database

Gutenberg PG-19

language-modeling-recommendation

Transformer-LS (small)

Arxiv HEP-TH citation graph

100 sleep nights of 8 caregivers

PubMed Cognitive Control Abstracts

PAR Transformer 24B

2000 HUB5 English
