Video Question Answering on AGQA 2.0 Balanced
Evaluation Metric
Average Accuracy
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model | Average Accuracy |
---|---|
mist-multi-modal-iterative-spatial-temporal | 50.96 |
glance-and-focus-memory-prompting-for-multi-1 | 53.33 |
mist-multi-modal-iterative-spatial-temporal | 54.39 |
learning-situation-hyper-graphs-for-video | 49.2 |
glance-and-focus-memory-prompting-for-multi-1 | 48.59 |
mmtf-multi-modal-temporal-fusion-for | 44.36 |
svitt-temporal-learning-of-sparse-video-text | 52.7 |
glance-and-focus-memory-prompting-for-multi-1 | 55.08 |