Multiple-Choice Question Answering (MCQA) on MedMCQA
Metrics: Dev Set (Acc-%) and Test Set (Acc-%).
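Both numbers are plain multiple-choice accuracy: the share of questions for which the predicted option matches the gold option, reported in percent. A minimal sketch of the computation (hypothetical helper, not HyperAI code):

```python
from typing import Sequence

def accuracy_percent(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Percentage of questions where the predicted option index equals the gold index."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(int(p == g) for p, g in zip(predictions, labels))
    return 100.0 * correct / len(labels)

# Example: 3 of 4 answers correct -> 75.0 (Acc-%)
print(accuracy_percent([0, 2, 1, 3], [0, 2, 1, 1]))
```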
Results

Performance of the models below on this benchmark. All accuracies are given in percent; "-" marks a value not reported by the source paper.
| Model | Dev Set (Acc-%) | Test Set (Acc-%) | Paper |
| --- | --- | --- | --- |
| Meditron-70B (CoT + SC) | 66.0 | - | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models |
| Codex 5-shot CoT | 59.7 | 62.7 | Can large language models reason about medical questions? |
| VOD (BioLinkBERT) | 58.3 | 62.9 | Variational Open-Domain Question Answering |
| Flan-PaLM (540B, SC) | 57.6 | - | Large Language Models Encode Clinical Knowledge |
| Flan-PaLM (540B, Few-shot) | 56.5 | - | Large Language Models Encode Clinical Knowledge |
| PaLM (540B, Few-shot) | 54.5 | - | Large Language Models Encode Clinical Knowledge |
| Flan-PaLM (540B, CoT) | 53.6 | - | Large Language Models Encode Clinical Knowledge |
| GAL 120B (zero-shot) | 52.9 | - | Galactica: A Large Language Model for Science |
| Flan-PaLM (62B, Few-shot) | 46.2 | - | Large Language Models Encode Clinical Knowledge |
| PaLM (62B, Few-shot) | 43.4 | - | Large Language Models Encode Clinical Knowledge |
| PubMedBERT (Gu et al., 2022) | 40.0 | 41.0 | MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering |
| SciBERT (Beltagy et al., 2019) | 39.0 | 39.0 | MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering |
| BioBERT (Lee et al., 2020) | 38.0 | 37.0 | MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering |
| BERT-Base (Devlin et al., 2019) | 35.0 | 33.0 | MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering |
| Flan-PaLM (8B, Few-shot) | 34.5 | - | Large Language Models Encode Clinical Knowledge |
| BLOOM (few-shot, k=5) | 32.5 | - | Galactica: A Large Language Model for Science |
| OPT (few-shot, k=5) | 29.6 | - | Galactica: A Large Language Model for Science |
| PaLM (8B, Few-shot) | 26.7 | - | Large Language Models Encode Clinical Knowledge |
| Med-PaLM 2 (ER) | - | 72.3 | Towards Expert-Level Medical Question Answering with Large Language Models |
| BioMedGPT-10B | - | 51.4 | BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine |
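To reproduce a Dev Set number, one would score a model on the benchmark's validation split. A minimal sketch, assuming the benchmark is the MedMCQA dataset as distributed on the Hugging Face Hub (dataset id `openlifescienceai/medmcqa`, with option fields `opa`-`opd` and gold index `cop`); `answer_model` is a hypothetical stand-in for the system under evaluation:

```python
from datasets import load_dataset

def answer_model(question: str, options: list[str]) -> int:
    # Hypothetical placeholder: always picks option A.
    # Replace with a real model call returning an option index 0-3.
    return 0

# "validation" corresponds to the Dev Set column; the test split's
# labels are withheld in this distribution.
ds = load_dataset("openlifescienceai/medmcqa", split="validation")

correct = 0
for row in ds:
    options = [row["opa"], row["opb"], row["opc"], row["opd"]]
    pred = answer_model(row["question"], options)
    correct += int(pred == row["cop"])  # "cop" is the gold option index

print(f"Dev Set Acc-%: {100.0 * correct / len(ds):.1f}")
```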