HyperAI초신경

Video Question Answering on TVBench

Evaluation Metrics

Average Accuracy

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model Name                                     Average Accuracy
aria-an-open-multimodal-native-mixture-of      51.0
pllava-parameter-free-llava-extension-from-1   42.3
mplug-owl3-towards-long-image-sequence         42.2
tarsier-recipes-for-training-and-evaluating-1  46.9
video-instruction-tuning-with-synthetic-data   45.6
pllava-parameter-free-llava-extension-from-1   34.9
internlm-xcomposer-2-5-a-versatile-large       51.6
qwen2-vl-enhancing-vision-language-model-s     52.7
qwen2-vl-enhancing-vision-language-model-s     43.8
video-instruction-tuning-with-synthetic-data   50.0
st-llm-large-language-models-are-effective-1   35.7
gpt-4o-system-card                             39.9
videollama-2-advancing-spatial-temporal        48.4
pllava-parameter-free-llava-extension-from-1   36.4
videogpt-integrating-image-and-video-encoders  41.7
videollama-2-advancing-spatial-temporal        42.9
videollama-2-advancing-spatial-temporal        42.1
tarsier2-advancing-large-vision-language       54.7
mvbench-a-comprehensive-multi-modal-video      35.0
gemini-1-5-unlocking-multimodal-understanding  47.6
tarsier-recipes-for-training-and-evaluating-1  55.5