Zero-Shot Video Question Answering on Video-MME
Metrics
Accuracy (%)
Results
Accuracy of various models on the Video-MME benchmark in the zero-shot setting
Comparison Table
Model Name | Accuracy (%) |
---|---|
gemini-1-5-unlocking-multimodal-understanding | 66.3 |
gpt-4o-visual-perception-performance-of | 62.3 |
videollama-2-advancing-spatial-temporal | 60.9 |
vila-on-pre-training-for-visual-language | 61.4 |
video-rag-visually-aligned-retrieval | 77.4 |
Model 6 | 64.8 |
gemini-1-5-unlocking-multimodal-understanding | 71.9 |
gpt-4o-visual-perception-performance-of | 70.3 |