HyperAI초신경

Zero-Shot Video Question Answering on EgoSchema 1

Evaluation Metric

Accuracy
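As a minimal sketch of how this metric is typically computed for multiple-choice video question answering: accuracy is the percentage of questions for which the predicted option matches the ground-truth option. The function name `accuracy` and the sample option indices below are hypothetical illustrations, not part of the benchmark's official evaluation code.

```python
def accuracy(predictions, answers):
    """Return the percentage of predictions that match the ground-truth answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted option equals the correct option.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical option indices (0-4) for five questions; not real benchmark data.
predicted = [2, 0, 4, 1, 3]
ground_truth = [2, 0, 1, 1, 3]
score = accuracy(predicted, ground_truth)
print(f"{score:.1f}")
```

With five answer options per question, uniformly random guessing would score about 20.0 on this scale, which gives a baseline for reading the reported numbers.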

Evaluation Results

Performance of each model on this benchmark

Comparison Table
| Model Name | Accuracy (%) |
| --- | --- |
| timechat-a-time-sensitive-multimodal-large | 33.0 |
| mvbench-a-comprehensive-multi-modal-video | 54.4 |
| tarsier-recipes-for-training-and-evaluating-1 | 61.7 |
| a-simple-llm-framework-for-long-range-video | 50.3 |
| internvideo-general-video-foundation-models | 32.1 |
| videotree-adaptive-tree-based-video | 61.1 |
| vamos-versatile-action-models-for-video | 53.6 |
| understanding-long-videos-in-one-multimodal | 37.6 |
| self-chained-image-language-model-for-video-1 | 22.7 |
| too-many-frames-not-all-useful-efficient | 61.1 |
| bimba-selective-scan-compression-for-long | 71.14 |
| Model 12 | 20.0 |
| vamos-versatile-action-models-for-video | 48.3 |
| mvbench-a-comprehensive-multi-modal-video | 55.8 |
| videollama-2-advancing-spatial-temporal | 63.9 |
| video-rag-visually-aligned-retrieval | 66.7 |
| mplug-owl-modularization-empowers-large | 31.1 |
| linvt-empower-your-image-level-large-language | 69.5 |
| traveler-a-multi-lmm-agent-framework-for | 53.3 |
| mvbench-a-comprehensive-multi-modal-video | 56.7 |
| a-simple-llm-framework-for-long-range-video | 33.5 |
| video-recap-recursive-captioning-of-hour-long | 50.23 |
| zero-shot-video-question-answering-via-frozen | 26.9 |
| vamos-versatile-action-models-for-video | 36.7 |
| language-repository-for-long-video | 41.2 |
| internvideo2-scaling-video-foundation-models | 60.2 |
| longvu-spatiotemporal-adaptive-compression | 67.6 |