HyperAI

Zero-Shot Video Question Answering on EgoSchema

Metrics

Accuracy
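
As a minimal sketch of the metric, assuming accuracy here means the percentage of multiple-choice questions answered correctly, the score could be computed as below. The function name `accuracy` and the sample option indices are hypothetical and are not taken from the benchmark's own evaluation code.

```python
def accuracy(predictions, answers):
    """Return the percentage of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Count the questions where the chosen option equals the reference option.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical data: option indices chosen for five questions.
predicted = [2, 0, 4, 1, 3]
reference = [2, 0, 1, 1, 3]
print(f"{accuracy(predicted, reference):.1f}")  # prints 80.0 (4 of 5 correct)
```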

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                      Accuracy (%)
mvbench-a-comprehensive-multi-modal-video       65.6
understanding-long-videos-in-one-multimodal     60.3
Model 3                                         20.0
language-repository-for-long-video              66.2
a-simple-llm-framework-for-long-range-video     50.8
slowfast-llava-a-strong-training-free           47.2
a-simple-llm-framework-for-long-range-video     57.6
tarsier-recipes-for-training-and-evaluating-1   68.6
self-chained-image-language-model-for-video-1   25.7
too-many-frames-not-all-useful-efficient        66.0
ts-llava-constructing-visual-tokens-through     57.8
videotree-adaptive-tree-based-video             66.2
mvbench-a-comprehensive-multi-modal-video       63.6