VOID Depth Estimation Dataset

VOID stands for Visual Odometry with Inertial and Depth. The dataset contains 56 video sequences in total: 48 sequences (about 47k frames) form the training set, and the remaining 8 sequences form the test set.
The dataset covers a variety of indoor and outdoor scenes, including classrooms, offices, stairwells, laboratories, and gardens. Each sequence comes with sparse depth maps at three density levels (1500, 500, and 150 points), which correspond to roughly 0.5%, 0.15%, and 0.05% of the pixels in a VGA-size (640 x 480) image.
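
The density percentages follow from dividing the number of sparse points by the pixel count of a VGA frame. The short sketch below reproduces that arithmetic; the function name `sparse_density_percent` and the constants are illustrative and not part of any official VOID tooling.

```python
# Illustrative arithmetic only: names here are hypothetical, not VOID tooling.
# VGA resolution is 640 x 480 pixels (307,200 pixels in total).
VGA_WIDTH, VGA_HEIGHT = 640, 480


def sparse_density_percent(num_points, width=VGA_WIDTH, height=VGA_HEIGHT):
    """Return the percentage of image pixels that carry a sparse depth value."""
    return 100.0 * num_points / (width * height)


# The three density levels described for the dataset.
densities = {n: sparse_density_percent(n) for n in (1500, 500, 150)}

for n, pct in densities.items():
    print(f"{n:>4} points: {pct:.3f}% of a {VGA_WIDTH}x{VGA_HEIGHT} frame")
```

The computed values come out to about 0.49%, 0.16%, and 0.05%, which match the rounded figures quoted above.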