HyperAI超神経

Video Question Answering on MSRVTT-QA

Evaluation Metric

Accuracy
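
The page does not say how accuracy is computed. As a minimal sketch, assuming the common convention of exact-match accuracy reported as a percentage, the metric could be calculated as below. The function name `accuracy` and the case-insensitive, whitespace-trimmed matching rule are illustrative assumptions, not something stated by the benchmark.

```python
def accuracy(predictions, answers):
    """Return the percentage of predictions that match the reference answer.

    Assumption: a prediction counts as correct when it equals the reference
    after trimming whitespace and lowercasing (exact-match scoring).
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        p.strip().lower() == a.strip().lower()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


# Hypothetical example: two of four answers match.
score = accuracy(["dog", "cat", "run", "two"], ["dog", "car", "run", "one"])
```

Under this assumption, a leaderboard value such as 47.0 would mean that 47.0% of the test questions were answered correctly.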

Evaluation Results

Performance results of each model on this benchmark

Comparison Table

Model name                                      Accuracy
zero-shot-video-question-answering-via-frozen   47.0
mplug-2-a-modularized-multi-modal-foundation    48.0
zero-shot-video-question-answering-via-frozen   16.7
an-empirical-study-of-end-to-end-video          44.5
revealing-single-frame-bias-for-video-and       43.9
video-text-as-game-players-hierarchical         46.2
valor-vision-audio-language-omni-perception     49.2
revealing-single-frame-bias-for-video-and       43.5
vast-a-vision-audio-subtitle-text-omni-1        50.1
mirasol3b-a-multimodal-autoregressive-model     50.42
vindlu-a-recipe-for-effective-video-and         44.6
cosa-concatenated-sample-pretrained-vision      49.2
ma-lmm-memory-augmented-large-multimodal        48.5
expectation-maximization-contrastive-learning   45.8