HyperAI
Visual Question Answering on ViP-Bench
Metrics
GPT-4 score (bbox)
GPT-4 score (human)
Results
Performance results of various models on this benchmark
| Model Name | GPT-4 score (bbox) | GPT-4 score (human) | Paper Title |
| --- | --- | --- | --- |
| GPT-4V-turbo-detail:high (Visual Prompt) | 60.7 | 59.9 | GPT-4 Technical Report |
| GPT-4V-turbo-detail:low (Visual Prompt) | 52.8 | 51.4 | GPT-4 Technical Report |
| LLaVA-NeXT-Inst-IT-Qwen2-7B (Visual Prompt) | 50.5 | 49.0 | Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning |
| ViP-LLaVA-13B (Visual Prompt) | 48.3 | 48.2 | Making Large Multimodal Models Understand Arbitrary Visual Prompts |
| LLaVA-1.5-13B (Coordinates) | 47.1 | - | Improved Baselines with Visual Instruction Tuning |
| Qwen-VL-Chat (Coordinates) | 45.3 | - | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond |
| LLaVA-NeXT-Inst-IT-Vicuna-7B (Visual Prompt) | 45.1 | 48.2 | Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning |
| LLaVA-1.5-13B (Visual Prompt) | 41.8 | 42.9 | Improved Baselines with Visual Instruction Tuning |
| Qwen-VL-Chat (Visual Prompt) | 39.2 | 41.7 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond |
| InstructBLIP-13B (Visual Prompt) | 35.8 | 35.2 | InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning |
| GPT4RoI-7B (ROI) | 35.1 | - | GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest |
| Shikra-7B (Coordinates) | 33.7 | - | Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic |
| Kosmos-2 (Discrete Token) | 26.9 | - | Kosmos-2: Grounding Multimodal Large Language Models to the World |