Video Question Answering on AGQA 2.0 Balanced
Metrics
Average Accuracy
Results
Performance results of various models on this benchmark
Comparison Table
Model Name | Average Accuracy |
---|---|
mist-multi-modal-iterative-spatial-temporal | 50.96 |
glance-and-focus-memory-prompting-for-multi-1 | 53.33 |
mist-multi-modal-iterative-spatial-temporal | 54.39 |
learning-situation-hyper-graphs-for-video | 49.2 |
glance-and-focus-memory-prompting-for-multi-1 | 48.59 |
mmtf-multi-modal-temporal-fusion-for | 44.36 |
svitt-temporal-learning-of-sparse-video-text | 52.7 |
glance-and-focus-memory-prompting-for-multi-1 | 55.08 |
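
The table above is not sorted by score, and several model slugs appear more than once. As a minimal sketch of how the listed entries rank, the snippet below copies the rows as they appear and sorts them by Average Accuracy in descending order. The variable names `rows` and `ranked` are illustrative assumptions. The source does not say how the repeated entries differ, so they are kept as separate rows.

```python
# Rows copied verbatim from the comparison table (model slug, average accuracy).
rows = [
    ("mist-multi-modal-iterative-spatial-temporal", 50.96),
    ("glance-and-focus-memory-prompting-for-multi-1", 53.33),
    ("mist-multi-modal-iterative-spatial-temporal", 54.39),
    ("learning-situation-hyper-graphs-for-video", 49.2),
    ("glance-and-focus-memory-prompting-for-multi-1", 48.59),
    ("mmtf-multi-modal-temporal-fusion-for", 44.36),
    ("svitt-temporal-learning-of-sparse-video-text", 52.7),
    ("glance-and-focus-memory-prompting-for-multi-1", 55.08),
]

# Sort by the score column, highest first. Repeated slugs stay as separate entries.
ranked = sorted(rows, key=lambda row: row[1], reverse=True)

# Print a simple numbered ranking.
for rank, (model, score) in enumerate(ranked, start=1):
    print(f"{rank}. {model}: {score}")
```

Keeping the rows as plain tuples avoids any dependency beyond the standard library. Any grouping of the repeated slugs would need information that the table does not provide.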