Question Answering on MedQA-USMLE
Evaluation Metric
Accuracy
Evaluation Results
Performance results of each model on this benchmark
| Model | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| Med-Gemini | 91.1 | Capabilities of Gemini Models in Medicine | - |
| GPT-4 | 90.2 | Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine | |
| Med-PaLM 2 | 85.4 | Towards Expert-Level Medical Question Answering with Large Language Models | |
| Med-PaLM 2 (CoT + SC) | 83.7 | Towards Expert-Level Medical Question Answering with Large Language Models | |
| Med-PaLM 2 (5-shot) | 79.7 | Towards Expert-Level Medical Question Answering with Large Language Models | |
| MedMobile (3.8B) | 75.7 | MedMobile: A mobile-sized language model with expert-level clinical capabilities | |
| Meerkat-7B | 74.3 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks | - |
| Meerkat-7B (Single) | 70.6 | Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks | - |
| Meditron-70B (CoT + SC) | 70.2 | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | |
| Flan-PaLM (540B) | 67.6 | Large Language Models Encode Clinical Knowledge | |
| LLAMA-2 (70B SC CoT) | 61.5 | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | |
| Shakti-LLM (2.5B) | 60.3 | SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | - |
| Codex 5-shot CoT | 60.2 | Can large language models reason about medical questions? | |
| LLAMA-2 (70B) | 59.2 | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | |
| VOD (BioLinkBERT) | 55.0 | Variational Open-Domain Question Answering | |
| BioMedGPT-10B | 50.4 | BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine | |
| PubMedGPT (2.7B) | 50.3 | Large Language Models Encode Clinical Knowledge | |
| DRAGON + BioLinkBERT | 47.5 | Deep Bidirectional Language-Knowledge Graph Pretraining | |
| BioLinkBERT (340M) | 45.1 | Large Language Models Encode Clinical Knowledge | |
| GAL 120B (zero-shot) | 44.4 | Galactica: A Large Language Model for Science | |