Visual Question Answering on CLEVR-Humans
Metrics
Accuracy
Results
Performance results of various models on this benchmark
| Model Name | Accuracy | Paper Title |
| --- | --- | --- |
| MDETR | 81.7 | MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding |
| MAC | 81.5 | Compositional Attention Networks for Machine Reasoning |
| CNN+GRU+FiLM | 75.9 | FiLM: Visual Reasoning with a General Conditioning Layer |
| NS-VQA (1K programs) | 67.8 | Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding |
| IEP-18K | 66.6 | Inferring and Executing Programs for Visual Reasoning |