HyperAI

VCGBench-Diverse on VideoInstruct

Metrics

Consistency
Contextual Understanding
Correctness of Information
Dense Captioning
Detail Orientation
Reasoning
Spatial Understanding
Temporal Understanding
mean

Results

Performance results of various models on this benchmark

Comparison Table
| Model Name | Consistency | Contextual Understanding | Correctness of Information | Dense Captioning | Detail Orientation | Reasoning | Spatial Understanding | Temporal Understanding | mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| one-for-all-video-conversation-is-feasible | 2.27 | 2.59 | 2.20 | 1.03 | 2.62 | 3.62 | 2.35 | 1.29 | 2.19 |
| videogpt-integrating-image-and-video-encoders | 2.59 | 2.81 | 2.46 | 1.38 | 2.73 | 3.63 | 2.80 | 1.78 | 2.47 |
| chat-univi-unified-visual-representation | 2.36 | 2.66 | 2.29 | 1.33 | 2.56 | 3.59 | 2.36 | 1.56 | 2.29 |
| mvbench-a-comprehensive-multi-modal-video | 2.27 | 2.51 | 2.13 | 1.26 | 2.42 | 3.13 | 2.43 | 1.66 | 2.20 |
| vtimellm-empower-llm-to-grasp-video-moments | 2.35 | 2.48 | 2.16 | 1.13 | 2.41 | 3.45 | 2.29 | 1.46 | 2.17 |
| video-chatgpt-towards-detailed-video | 2.06 | 2.46 | 2.07 | 0.89 | 2.42 | 3.60 | 2.25 | 1.39 | 2.08 |
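
The page does not say how the mean column is computed. Judging only from the scores in the comparison table, it appears to be the average of five metrics (Consistency, Contextual Understanding, Correctness of Information, Detail Orientation, Temporal Understanding), leaving out Dense Captioning, Reasoning, and Spatial Understanding. The sketch below checks that assumption against each row; the names `CORE` and `core_mean` are illustrative and do not come from the benchmark.

```python
# Column order as listed in the comparison table.
METRICS = [
    "Consistency", "Contextual Understanding", "Correctness of Information",
    "Dense Captioning", "Detail Orientation", "Reasoning",
    "Spatial Understanding", "Temporal Understanding",
]

# Assumption inferred from the numbers, not stated on the page:
# the reported mean averages only these five metrics.
CORE = [
    "Consistency", "Contextual Understanding", "Correctness of Information",
    "Detail Orientation", "Temporal Understanding",
]

# model slug -> (eight metric scores in METRICS order, reported mean)
ROWS = {
    "one-for-all-video-conversation-is-feasible":
        ([2.27, 2.59, 2.20, 1.03, 2.62, 3.62, 2.35, 1.29], 2.19),
    "videogpt-integrating-image-and-video-encoders":
        ([2.59, 2.81, 2.46, 1.38, 2.73, 3.63, 2.80, 1.78], 2.47),
    "chat-univi-unified-visual-representation":
        ([2.36, 2.66, 2.29, 1.33, 2.56, 3.59, 2.36, 1.56], 2.29),
    "mvbench-a-comprehensive-multi-modal-video":
        ([2.27, 2.51, 2.13, 1.26, 2.42, 3.13, 2.43, 1.66], 2.20),
    "vtimellm-empower-llm-to-grasp-video-moments":
        ([2.35, 2.48, 2.16, 1.13, 2.41, 3.45, 2.29, 1.46], 2.17),
    "video-chatgpt-towards-detailed-video":
        ([2.06, 2.46, 2.07, 0.89, 2.42, 3.60, 2.25, 1.39], 2.08),
}


def core_mean(scores):
    """Average the five CORE metrics and round to two decimals."""
    by_name = dict(zip(METRICS, scores))
    return round(sum(by_name[m] for m in CORE) / len(CORE), 2)


if __name__ == "__main__":
    for model, (scores, reported) in ROWS.items():
        print(f"{model}: computed={core_mean(scores):.2f} reported={reported:.2f}")
```

Averaging all eight columns instead gives a different value (about 2.25 for the first row, against a reported 2.19), which is why the five-metric reading seems the more likely one.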