HyperAI
HyperAI초신경
홈
플랫폼
문서
뉴스
연구 논문
튜토리얼
데이터셋
백과사전
SOTA
LLM 모델
GPU 랭킹
컨퍼런스
전체 검색
소개
서비스 약관
개인정보 처리방침
한국어
HyperAI
HyperAI초신경
Toggle Sidebar
전체 사이트 검색...
⌘
K
Command Palette
Search for a command to run...
플랫폼
홈
SOTA
차트 질문 응답
Chart Question Answering On Chartqa
Chart Question Answering On Chartqa
평가 지표
1:1 Accuracy
평가 결과
이 벤치마크에서 각 모델의 성능 결과
Columns
모델 이름
1:1 Accuracy
Paper Title
ChartPaLI-5B + PaLM 2-S
81.3
Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs
Gemini Ultra
80.8
Gemini: A Family of Highly Capable Multimodal Models
DePlot+FlanPaLM+Codex (PoT Self-Consistency)
79.3
DePlot: One-shot visual language reasoning by plot-to-table translation
ChartPaLI-5B
77.3
Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs
ScreenAI 5B (4.62 B params, w/ OCR)
76.7
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
DePlot+Codex (PoT Self-Consistency)
76.7
DePlot: One-shot visual language reasoning by plot-to-table translation
SMoLA-PaLI-X Specialist Model
74.6
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
SMoLA-PaLI-X Generalist Model
73.8
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
MatCha4096 + LaMenDa
72.64
Synthesize Step-by-Step: Tools Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
PaLI-X (Single-task FT w/ OCR)
72.3
PaLI-X: On Scaling up a Multilingual Vision and Language Model
PaLI-X (Single-task FT)
70.9
PaLI-X: On Scaling up a Multilingual Vision and Language Model
PaLI-X (Multi-task FT)
70.6
PaLI-X: On Scaling up a Multilingual Vision and Language Model
DePlot+FlanPaLM (Self-Consistency)
70.5
DePlot: One-shot visual language reasoning by plot-to-table translation
PaLI-3
70
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
PaLI-3 (w/ OCR)
69.5
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
DePlot+FlanPaLM (CoT)
67.3
DePlot: One-shot visual language reasoning by plot-to-table translation
Qwen-VL-Chat
66.3
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
UniChart
66.24
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Qwen-VL
65.7
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
StructChart+GPT3.5 (STR ChartQA+SimChart9K)
65.3
StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding
0 of 27 row(s) selected.
Previous
Next