HyperAI
Question Answering on TruthfulQA
Evaluation Metric
EM
Evaluation Results
Performance results of each model on this benchmark
| Model | EM | Paper Title | Repository |
| --- | --- | --- | --- |
| CoA | 67.3 | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | |
| Gopher 280B (zero-shot, QA prompts) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| LLaMA 65B | - | LLaMA: Open and Efficient Foundation Language Models | |
| GPT-2 1.5B | - | TruthfulQA: Measuring How Models Mimic Human Falsehoods | |
| Shakti-LLM (2.5B) | - | SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | - |
| LLaMA-2-Chat-13B + Representation Control (Contrast Vector) | - | Representation Engineering: A Top-Down Approach to AI Transparency | |
| GAL 6.7B | - | Galactica: A Large Language Model for Science | |
| Vicuna 7B + Inference Time Intervention (ITI) | - | - | - |
| GAL 30B | - | Galactica: A Large Language Model for Science | |
| GAL 1.3B | - | Galactica: A Large Language Model for Science | |
| Gopher 7.1B (zero-shot, QA prompts) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| CoA w/o actions | 63.3 | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | |
| ToT | 66.6 | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | |
| Gopher 7.1B (zero-shot, Our Prompt + Choices) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| LLaMa-2-7B-Chat + TruthX | - | TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space | |
| GAL 120B | - | Galactica: A Large Language Model for Science | |
| LLaMA 7B | - | LLaMA: Open and Efficient Foundation Language Models | |
| UnifiedQA 3B | - | TruthfulQA: Measuring How Models Mimic Human Falsehoods | |
| Gopher 1.4B (zero-shot, QA prompts) | - | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| GAL 125M | - | Galactica: A Large Language Model for Science | |