Multi-Task Language Understanding on MMLU
Metrics
Average (%)

Results
Performance results of various models on this benchmark.
| Model name | Average (%) | Paper Title | Repository |
|---|---|---|---|
| ALBERT-xxlarge 223M (fine-tuned) | 27.1 | ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | - |
| RoBERTa-base 125M (fine-tuned) | 27.9 | RoBERTa: A Robustly Optimized BERT Pretraining Approach | - |
| GPT-3 175B (5-shot) | 43.9 | Measuring Massive Multitask Language Understanding | - |
| Mixtral 8x7B (5-shot) | 70.6 | Mixtral of Experts | - |
| GPT-3 | 48.9 | UnifiedQA: Crossing Format Boundaries With a Single QA System | - |
| Gopher 7.1B (5-shot) | 29.5 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | - |
| Atlas (5-shot) | 47.9 | Atlas: Few-shot Learning with Retrieval Augmented Language Models | - |
| GPT-3 175B (5-shot) | 43.9 | Language Models are Few-Shot Learners | - |
| GLM-130B | 44.8 | GLM-130B: An Open Bilingual Pre-trained Model | - |
| LLaMA 2 13B (5-shot) | 54.8 | Llama 2: Open Foundation and Fine-Tuned Chat Models | - |
| GPT-NeoX 20B (5-shot) | 33.6 | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | - |
| DeepSeek-R1 671B | 87.5 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | - |
| Flan-T5-Base 250M (CoT) | 33.7 | Scaling Instruction-Finetuned Language Models | - |
| Claude Instant 1.1 (5-shot) | 73.4 | Model Card and Evaluations for Claude Models | - |
| Llama 2 65B | 73.5 | Scaling Instruction-Finetuned Language Models | - |
| Falcon 40B | 57.0 | The Falcon Series of Open Language Models | - |
| LLaMA 65B (fine-tuned) | 68.9 | LLaMA: Open and Efficient Foundation Language Models | - |
| Mistral 7B (5-shot) | 60.1 | Mistral 7B | - |
| Qwen1.5 72B (5-shot) | 77.5 | - | - |
| Qwen 7B (5-shot) | 56.7 | - | - |
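The Average (%) metric is the accuracy on MMLU test questions, which the MMLU paper reports as the unweighted (macro) mean of per-subject accuracies over the benchmark's 57 subjects. Below is a minimal sketch of that aggregation; the function name, subject names, and prediction data are hypothetical, and whether this leaderboard uses a macro- or micro-average is an assumption.

```python
from statistics import mean

def mmlu_macro_average(per_subject_correct: dict[str, list[bool]]) -> float:
    """Unweighted (macro) average of per-subject accuracies, in percent.

    per_subject_correct maps each MMLU subject name to a list of booleans,
    one per test question, indicating whether the model answered correctly.
    """
    subject_accuracies = [
        mean(results) for results in per_subject_correct.values() if results
    ]
    return 100.0 * mean(subject_accuracies)

# Hypothetical example with three of the 57 subjects.
scores = {
    "abstract_algebra": [True, False, True, True],
    "anatomy": [True, True, False],
    "world_religions": [False, True, True, True],
}
print(f"MMLU average: {mmlu_macro_average(scores):.1f}%")
```

Because the macro average weights every subject equally, a model's headline score can differ slightly from its accuracy over the pooled set of questions when subjects have different numbers of test items.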