HyperAI
Question Answering on TruthfulQA
Evaluation metric
EM (exact match)
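The page does not say how EM is computed for these entries. As a rough illustration only, the sketch below assumes the common SQuAD-style convention: lowercase the answer, strip punctuation and English articles, collapse whitespace, then check string equality against any accepted reference. The names `normalize`, `exact_match`, and `em_score` are hypothetical, and individual papers on this leaderboard may have used a different procedure.

```python
import re
import string
from typing import Sequence

# Assumption: SQuAD-style normalization; the leaderboard does not specify one.
_ARTICLES = re.compile(r"\b(a|an|the)\b")
_PUNCT = str.maketrans("", "", string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower().translate(_PUNCT)
    text = _ARTICLES.sub(" ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def em_score(predictions: Sequence[str],
             references: Sequence[Sequence[str]]) -> float:
    """Percentage of predictions that exactly match a reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Scores computed this way are on a 0 to 100 scale, which matches how the EM values are presented in the results table.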
Results
Performance of each model on this benchmark.
| Model | EM | Paper Title | Repository |
| --- | --- | --- | --- |
| CoA | 67.3 | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | |
| Gopher 280B (zero-shot, QA prompts) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| LLaMA 65B | - | LLaMA: Open and Efficient Foundation Language Models | |
| GPT-2 1.5B | - | TruthfulQA: Measuring How Models Mimic Human Falsehoods | |
| Shakti-LLM (2.5B) | - | SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | - |
| LLaMA-2-Chat-13B + Representation Control (Contrast Vector) | - | Representation Engineering: A Top-Down Approach to AI Transparency | |
| GAL 6.7B | - | Galactica: A Large Language Model for Science | |
| Vicuna 7B + Inference Time Intervention (ITI) | - | - | - |
| GAL 30B | - | Galactica: A Large Language Model for Science | |
| GAL 1.3B | - | Galactica: A Large Language Model for Science | |
| Gopher 7.1 (zero-shot, QA prompts) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| CoA w/o actions | 63.3 | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | |
| ToT | 66.6 | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | |
| Gopher 7.1B (zero-shot, Our Prompt + Choices) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| LLaMa-2-7B-Chat + TruthX | - | TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space | |
| GAL 120B | - | Galactica: A Large Language Model for Science | |
| LLaMA 7B | - | LLaMA: Open and Efficient Foundation Language Models | |
| UnifiedQA 3B | - | TruthfulQA: Measuring How Models Mimic Human Falsehoods | |
| Gopher 1.4 (zero-shot, QA prompts) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| GAL 125M | - | Galactica: A Large Language Model for Science | |
Showing 20 of 33 rows.