HyperAI

Visual Question Answering on GQA test-dev

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Comparison table
Model name                                     Accuracy
blip-2-bootstrapping-language-image-pre        34.6
blip-2-bootstrapping-language-image-pre        44.7
plug-and-play-vqa-zero-shot-vqa-by-conjoining  41.9
visual-program-distillation-distilling-tools   67.3
lxmert-learning-cross-modality-encoder         60.0
blip-2-bootstrapping-language-image-pre        44.4
a-good-prompt-is-worth-millions-of-parameters  29.3
blip-2-bootstrapping-language-image-pre        36.4
hydra-a-hyper-agent-for-dynamic-compositional  47.9
learning-by-abstraction-the-neural-state       62.95
lyrics-boosting-fine-grained-language-vision   62.4
blip-2-bootstrapping-language-image-pre        33.9
language-conditioned-graph-networks-for        55.8
coarse-to-fine-reasoning-for-visual-question   72.1
blip-2-bootstrapping-language-image-pre        44.2
video-lavit-unified-video-language-pre         64.4
cumo-scaling-multimodal-llm-with-co-upcycled   64.9