MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding

Renjie Li, Ruijie Ye, Mingyang Wu, Hao Frank Yang, Zhiwen Fan, Hezhen Hu, Zhengzhong Tu
Abstract

Humans are integral components of the transportation ecosystem, and understanding their behaviors is crucial to facilitating the development of safe driving systems. Although recent progress has explored various aspects of human behavior (such as motion, trajectories, and intention), a comprehensive benchmark for evaluating human behavior understanding in autonomous driving remains unavailable. In this work, we propose MMHU, a large-scale benchmark for human behavior analysis featuring rich annotations, such as human motion and trajectories, text descriptions of human motions, human intention, and critical behavior labels relevant to driving safety. Our dataset encompasses 57k human motion clips and 1.73M frames gathered from diverse sources, including established driving datasets such as Waymo, in-the-wild videos from YouTube, and self-collected data. A human-in-the-loop annotation pipeline is developed to generate rich behavior captions. We provide a thorough dataset analysis and benchmark multiple tasks, ranging from motion prediction to motion generation and human behavior question answering, thereby offering a broad evaluation suite. Project page: https://MMHU-Benchmark.github.io.