Visual Question Answering on A-OKVQA
Metrics
DA VQA Score
MC Accuracy
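The page lists the two metrics without defining them. As a rough illustration, the sketch below assumes that the DA (direct answer) score follows the commonly used VQA soft-accuracy rule, where a predicted answer earns min(1, matches / 3) against the human reference answers, and that MC (multiple choice) accuracy is the plain fraction of questions where the chosen option is the correct one. The function names `da_vqa_score` and `mc_accuracy` and the simple lowercase normalization are hypothetical choices for this example, not the benchmark's official evaluation code.

```python
from typing import Sequence


def da_vqa_score(prediction: str, reference_answers: Sequence[str]) -> float:
    """Soft accuracy for one question (assumed rule: min(1, matches / 3)).

    Full credit is given once at least three reference answers
    match the prediction after a simple normalization.
    """
    normalized = prediction.strip().lower()
    matches = sum(
        1 for answer in reference_answers if answer.strip().lower() == normalized
    )
    return min(1.0, matches / 3.0)


def mc_accuracy(predicted: Sequence[int], correct: Sequence[int]) -> float:
    """Fraction of questions where the chosen option index is the correct one."""
    if len(predicted) != len(correct):
        raise ValueError("predicted and correct must have the same length")
    if not correct:
        return 0.0
    hits = sum(1 for p, c in zip(predicted, correct) if p == c)
    return hits / len(correct)


if __name__ == "__main__":
    # Hypothetical data: ten reference answers for a single question.
    references = ["red"] * 2 + ["blue"] * 8
    print(da_vqa_score("Red", references))
    print(da_vqa_score("blue", references))
    print(mc_accuracy([0, 1, 2, 3], [0, 1, 0, 3]))
```

Under these assumptions, per-question values are averaged over the evaluation set and multiplied by 100 to obtain numbers on the same scale as the leaderboard entries.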
Results
Performance of each model on this benchmark
| Model | DA VQA Score | MC Accuracy | Paper Title |
| --- | --- | --- | --- |
| SMoLA-PaLI-X Specialist Model | 70.55 | 83.75 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts |
| PaLI-X-VPD | 68.2 | 80.4 | Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models |
| PromptCap | 59.6 | 73.2 | PromptCap: Prompt-Guided Task-Aware Image Captioning |
| Prophet | 58.5 | 75.1 | Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering |
| A Simple Baseline for KB-VQA | 57.5 | - | A Simple Baseline for Knowledge-Based Visual Question Answering |
| KRISP | 42.2 | 42.2 | KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA |
| GPV-2 | 40.7 | 53.7 | Webly Supervised Concept Expansion for General Purpose Vision Models |
| VLC-BERT | 38.05 | - | VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge |
| LXMERT | 25.9 | 41.6 | LXMERT: Learning Cross-Modality Encoder Representations from Transformers |
| ViLBERT | 25.9 | 41.5 | ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks |
| Pythia | 21.9 | 40.1 | Pythia v0.1: the Winning Entry to the VQA Challenge 2018 |
| ViLBERT - VQA | 12.0 | 42.1 | ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks |
| ViLBERT - OK-VQA | 9.2 | 34.1 | ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks |
| MC-CoT | - | 71 | Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training |
| HYDRA | - | 56.35 | HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning |