Visual Question Answering on VQA v2 test-dev 1
Evaluation Metrics
Accuracy
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Accuracy |
---|---|
florence-a-new-foundation-model-for-computer | 80.16 |
lxmert-model-compression-for-visual-question | 70.72 |
differentiable-outlier-detection-enable | 76.8 |
blip-2-bootstrapping-language-image-pre | 82.30 |
blip-2-bootstrapping-language-image-pre | 81.74 |
learning-to-localize-objects-improves-spatial | 56.2 |
blip-2-bootstrapping-language-image-pre | 81.66 |
unifying-architectures-tasks-and-modalities | 82.0 |
Model 9 | 77.69 |
coca-contrastive-captioners-are-image-text | 82.3 |
mplug-2-a-modularized-multi-modal-foundation | 81.11 |