DrivingDojo Autonomous Driving Dataset
The DrivingDojo dataset was jointly created in 2024 by the New Laboratory of Pattern Recognition of the Institute of Automation, Chinese Academy of Sciences; the School of Artificial Intelligence, University of Chinese Academy of Sciences; Meituan; and the Hong Kong Center for Artificial Intelligence and Robotics, Chinese Academy of Sciences. The accompanying paper, "DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model", aims to advance the development of interactive and knowledge-rich driving world models. The dataset contains about 18k video clips curated for simulating real-world visual interactions, covering rich driving actions, multi-agent interactions, and open-world driving knowledge.
The DrivingDojo dataset is characterized by its complete coverage of driving actions, its multi-agent interactions, and its rich open-world driving knowledge. It includes not only longitudinal maneuvers such as acceleration, emergency braking, and stop-and-go driving, but also lateral maneuvers such as U-turns, overtaking, and lane changes. In addition, the dataset was deliberately curated to contain many videos of multi-agent interaction trajectories, such as cut-ins, cut-offs, and head-on merges. DrivingDojo also contains videos of rare events, such as animals crossing the road, falling bottles, and road debris, which can nonetheless be encountered in real-world driving.
The videos have a resolution of 1920×1080 and a frame rate of 5 fps. The clips were collected in major Chinese cities, including Beijing, Shenzhen, and Xuzhou, and were recorded under varied weather and lighting conditions. All videos are paired with synchronized camera poses obtained from the on-board high-precision localization stack, which is driven by an HD map. Videos in the DrivingDojo-Open subset are also paired with text descriptions of the rare events occurring in each video.
To measure progress in driving scene modeling, DrivingDojo also introduces a new Action Instruction Following (AIF) benchmark that evaluates a world model's ability to generate plausible future rollouts. The benchmark assesses long-term motion controllability by computing the error between the action exhibited in the generated video and the given action instruction.
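As a rough illustration of the error computation described above, the sketch below compares an instructed trajectory with a trajectory estimated from a generated video. It is a minimal sketch under stated assumptions, not the benchmark's official implementation: the function name action_following_error, the (x, y) displacement representation, and the use of mean absolute error per axis are all hypothetical choices made for illustration. The step that recovers motion from the generated video is outside the scope of this sketch.

```python
from typing import Sequence, Tuple

# Assumed representation: one (x, y) displacement in metres per frame.
# The axis convention is an assumption, not taken from the dataset.
Point = Tuple[float, float]


def action_following_error(
    instructed: Sequence[Point],
    estimated: Sequence[Point],
) -> Tuple[float, float]:
    """Return the mean absolute error along each axis.

    `instructed` is the action instruction given to the world model;
    `estimated` is the motion recovered from the generated video
    (hypothetical input, produced by some external pose estimator).
    """
    if len(instructed) != len(estimated) or len(instructed) == 0:
        raise ValueError("trajectories must be non-empty and equally long")
    n = len(instructed)
    err_x = sum(abs(a[0] - b[0]) for a, b in zip(instructed, estimated)) / n
    err_y = sum(abs(a[1] - b[1]) for a, b in zip(instructed, estimated)) / n
    return err_x, err_y


# Toy example with made-up numbers.
instructed = [(0.0, 1.0), (0.0, 2.0), (0.5, 3.0)]
estimated = [(0.0, 1.0), (0.5, 2.0), (0.5, 2.0)]
err_x, err_y = action_following_error(instructed, estimated)
print(f"error along x: {err_x:.3f} m, error along y: {err_y:.3f} m")
```

Lower values on both axes would indicate that the generated video follows the instruction more closely. Reporting the two axes separately keeps lateral and longitudinal deviations distinguishable.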
Overall, the DrivingDojo dataset gives the autonomous driving community a valuable resource for improving the prediction and control capabilities of world models in complex driving environments.
