HyperAI

VCGBench-Diverse

VCGBench-Diverse is a benchmark for evaluating how well video large language models generalize. It comprises 877 video clips spanning 18 broad categories, with 4,354 question-answer pairs. Models are assessed on five aspects: information accuracy, detail orientation, context understanding, temporal understanding, and consistency. The benchmark also provides performance breakdowns in three areas: dense video captioning, spatial understanding, and reasoning.
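
As a rough illustration of how per-aspect results might be summarized, the sketch below averages scores grouped by the five aspects listed above. The record layout (a list of `(aspect, score)` pairs), the function name `mean_score_per_aspect`, and the sample scores are all assumptions made for this example; they are not the benchmark's actual result format or scoring procedure.

```python
from collections import defaultdict
from statistics import mean

# The five evaluation aspects named in the description above.
ASPECTS = (
    "information accuracy",
    "detail orientation",
    "context understanding",
    "temporal understanding",
    "consistency",
)


def mean_score_per_aspect(records):
    """Average scores per aspect.

    `records` is a hypothetical list of (aspect, score) pairs; the real
    benchmark's output format may differ.
    """
    grouped = defaultdict(list)
    for aspect, score in records:
        if aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {aspect!r}")
        grouped[aspect].append(score)
    # Only aspects that actually received scores appear in the result.
    return {aspect: mean(scores) for aspect, scores in grouped.items()}


# Made-up scores, for illustration only.
example = [
    ("consistency", 3),
    ("consistency", 4),
    ("temporal understanding", 2),
]
summary = mean_score_per_aspect(example)
```

The same grouping could be applied to the three breakdown areas (dense video captioning, spatial understanding, reasoning) by swapping the key used for grouping.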