Visual Question Answering on VQA v1 test-dev
Metrics
Accuracy
Results
Performance results of various models on this benchmark
Model name | Accuracy | Paper Title | Repository |
---|---|---|---|
DAN (ResNet) | 64.3 | Dual Attention Networks for Multimodal Reasoning and Matching | |
DMN+ | 60.3 | Dynamic Memory Networks for Visual and Textual Question Answering | |
SAAA (ResNet) | 64.5 | Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering | |
RAU (ResNet) | 63.3 | Training Recurrent Answering Units with Joint Loss Minimization for VQA | |
NMN+LSTM+FT | 58.6 | Neural Module Networks | |
HieCoAtt (ResNet) | 61.8 | Hierarchical Question-Image Co-Attention for Visual Question Answering | |
MCB (ResNet) | 64.2 | Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | |