Chart Question Answering on ChartQA
Evaluation Metric
1:1 Accuracy
Evaluation Results
Performance results of each model on this benchmark
| Model | 1:1 Accuracy | Paper Title |
| --- | --- | --- |
| ChartPaLI-5B + PaLM 2-S | 81.3 | Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs |
| Gemini Ultra | 80.8 | Gemini: A Family of Highly Capable Multimodal Models |
| DePlot+FlanPaLM+Codex (PoT Self-Consistency) | 79.3 | DePlot: One-shot visual language reasoning by plot-to-table translation |
| ChartPaLI-5B | 77.3 | Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs |
| ScreenAI 5B (4.62 B params, w/ OCR) | 76.7 | ScreenAI: A Vision-Language Model for UI and Infographics Understanding |
| DePlot+Codex (PoT Self-Consistency) | 76.7 | DePlot: One-shot visual language reasoning by plot-to-table translation |
| SMoLA-PaLI-X Specialist Model | 74.6 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts |
| SMoLA-PaLI-X Generalist Model | 73.8 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts |
| MatCha4096 + LaMenDa | 72.64 | Synthesize Step-by-Step: Tools Templates and LLMs as Data Generators for Reasoning-Based Chart VQA |
| PaLI-X (Single-task FT w/ OCR) | 72.3 | PaLI-X: On Scaling up a Multilingual Vision and Language Model |
| PaLI-X (Single-task FT) | 70.9 | PaLI-X: On Scaling up a Multilingual Vision and Language Model |
| PaLI-X (Multi-task FT) | 70.6 | PaLI-X: On Scaling up a Multilingual Vision and Language Model |
| DePlot+FlanPaLM (Self-Consistency) | 70.5 | DePlot: One-shot visual language reasoning by plot-to-table translation |
| PaLI-3 | 70 | PaLI-3 Vision Language Models: Smaller, Faster, Stronger |
| PaLI-3 (w/ OCR) | 69.5 | PaLI-3 Vision Language Models: Smaller, Faster, Stronger |
| DePlot+FlanPaLM (CoT) | 67.3 | DePlot: One-shot visual language reasoning by plot-to-table translation |
| Qwen-VL-Chat | 66.3 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond |
| UniChart | 66.24 | UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning |
| Qwen-VL | 65.7 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond |
| StructChart+GPT3.5 (STR ChartQA+SimChart9K) | 65.3 | StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding |