HyperAI
Visual Question Answering On Coco Visual 4
Metrics
Percentage correct
Results
Performance results of various models on this benchmark
| Model Name | Percentage correct | Paper Title |
| --- | --- | --- |
| MCB 7 att. | 66.5 | Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding |
| Dual-MFA | 66.09 | Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering |
| QGHC+Att+Concat | 65.90 | Question-Guided Hybrid Convolution for Visual Question Answering |
| RelAtt | 65.69 | R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering |
| joint-loss | 63.2 | Training Recurrent Answering Units with Joint Loss Minimization for VQA |
| HQI+ResNet | 62.1 | Hierarchical Question-Image Co-Attention for Visual Question Answering |
| MRN + global features | 61.8 | Multimodal Residual Learning for Visual QA |
| DMN+ [xiong2016dynamic] | 60.4 | Dynamic Memory Networks for Visual and Textual Question Answering |
| CNN-RNN | 59.5 | Image Captioning and Visual Question Answering Based on Attributes and External Knowledge |
| FDA | 59.5 | A Focused Dynamic Attention Model for Visual Question Answering |
| SAN | 58.9 | Stacked Attention Networks for Image Question Answering |
| LSTM Q+I | 58.2 | VQA: Visual Question Answering |
| SMem-VQA | 58.2 | Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering |
| iBOWIMG baseline | 55.9 | Simple Baseline for Visual Question Answering |