HyperAI
Visual Question Answering (VQA) on Core-MM
Evaluation Metrics
- Abductive
- Analogical
- Deductive
- Overall score
- Params
Evaluation Results
Performance of each model on this benchmark
| Model | Abductive | Analogical | Deductive | Overall score | Params | Paper Title |
| --- | --- | --- | --- | --- | --- | --- |
| MiniGPT-v2 | 13.28 | 5.69 | 11.02 | 10.43 | 8B | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models |
| BLIP-2-OPT2.7B | 18.96 | 7.5 | 2.76 | 19.31 | 3B | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models |
| GPT-4V | 77.88 | 69.86 | 74.86 | 74.44 | - | GPT-4 Technical Report |
| SPHINX v2 | 49.85 | 20.69 | 42.17 | 39.48 | 16B | SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models |
| InstructBLIP | 37.76 | 20.56 | 27.56 | 28.02 | 8B | InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning |
| Emu | 36.57 | 18.19 | 28.9 | 28.24 | 14B | Emu: Generative Pretraining in Multimodality |
| Otter | 33.64 | 13.33 | 22.49 | 22.69 | 7B | Otter: A Multi-Modal Model with In-Context Instruction Tuning |
| CogVLM-Chat | 47.88 | 28.75 | 36.75 | 37.16 | 17B | CogVLM: Visual Expert for Pretrained Language Models |
| mPLUG-Owl2 | 20.6 | 7.64 | 23.43 | 20.05 | 7B | mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration |
| OpenFlamingo-v2 | 5.3 | 1.11 | 8.88 | 6.82 | 9B | OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models |
| LLaVA-1.5 | 47.91 | 24.31 | 30.94 | 32.62 | 13B | Improved Baselines with Visual Instruction Tuning |
| Qwen-VL-Chat | 44.39 | 30.42 | 37.55 | 37.39 | 16B | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond |
| LLaMA-Adapter V2 | 46.12 | 22.08 | 28.7 | 30.46 | 7B | LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model |
| InternLM-XComposer-VL | 35.97 | 18.61 | 26.77 | 26.84 | 9B | InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition |