HyperAI

Visual Question Answering on GQA test-dev

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Comparison table
Model name                                     Accuracy
a-good-prompt-is-worth-millions-of-parameters  29.3
blip-2-bootstrapping-language-image-pre        34.6
blip-2-bootstrapping-language-image-pre        44.7
blip-2-bootstrapping-language-image-pre        44.4
blip-2-bootstrapping-language-image-pre        36.4
blip-2-bootstrapping-language-image-pre        33.9
blip-2-bootstrapping-language-image-pre        44.2
coarse-to-fine-reasoning-for-visual-question   72.1
cumo-scaling-multimodal-llm-with-co-upcycled   64.9
hydra-a-hyper-agent-for-dynamic-compositional  47.9
language-conditioned-graph-networks-for        55.8
learning-by-abstraction-the-neural-state       62.95
lxmert-learning-cross-modality-encoder         60.0
lyrics-boosting-fine-grained-language-vision   62.4
plug-and-play-vqa-zero-shot-vqa-by-conjoining  41.9
video-lavit-unified-video-language-pre         64.4
visual-program-distillation-distilling-tools   67.3