Virtual KITTI Synthetic Video Dataset
License: CC BY-NC-SA 3.0

Virtual KITTI is a photo-realistic synthetic video dataset for training and evaluating computer vision models on several video understanding tasks: object detection and multi-object tracking, scene-level and instance-level semantic segmentation, optical flow, and depth estimation.
The dataset contains 50 high-resolution monocular videos (21,260 frames) generated from five different virtual worlds in urban environments under different imaging and weather conditions. These worlds were created using the Unity game engine and a novel real-to-virtual cloning method.
The synthetic videos are automatically, accurately, and comprehensively annotated: they include 2D and 3D multi-object tracking ground truth, together with pixel-level category, instance, optical flow, and depth labels.