Visual Question Answering on CLEVR-Humans
Evaluation Metric
Accuracy
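
As a rough illustration of the metric, the sketch below computes accuracy as the percentage of questions whose predicted answer matches the reference answer. The function name `accuracy` and the case and whitespace normalization are assumptions made for this example. The benchmark's official evaluation procedure is not described here and may differ.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of predictions that match the reference answers.

    Assumption: answers are compared after lowercasing and trimming
    whitespace; an official evaluation script may normalize differently.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")

    # Count exact matches after the (assumed) normalization step.
    correct = sum(
        p.strip().lower() == a.strip().lower()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the benchmark.
    preds = ["yes", "2", "cube", "red"]
    golds = ["yes", "3", "cube", "red"]
    print(f"{accuracy(preds, golds):.1f}")
```

The result is on a 0 to 100 scale, which appears to match the scale of the values reported in the results table.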
Evaluation Results
Performance of each model on this benchmark
Model Name | Accuracy | Paper Title | Repository |
---|---|---|---|
MAC | 81.5 | Compositional Attention Networks for Machine Reasoning | |
CNN+GRU+FiLM | 75.9 | FiLM: Visual Reasoning with a General Conditioning Layer | |
IEP-18K | 66.6 | Inferring and Executing Programs for Visual Reasoning | |
MDETR | 81.7 | MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | |
NS-VQA (1K programs) | 67.8 | Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding | |