Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation

Despite progress in stereo depth estimation, omnidirectional imaging remains underexplored, mainly due to the lack of appropriate data. We introduce Helvipad, a real-world dataset for omnidirectional stereo depth estimation, featuring 40K frames from video sequences across diverse environments, including crowded indoor and outdoor scenes with various lighting conditions. Collected using two 360° cameras in a top-bottom setup and a LiDAR sensor, the dataset includes accurate depth and disparity labels obtained by projecting 3D point clouds onto equirectangular images. Additionally, we provide an augmented training set with increased label density, obtained through depth completion. We benchmark leading stereo depth estimation models for both standard and omnidirectional images. The results show that while recent stereo methods perform reasonably well, accurately estimating depth in omnidirectional imaging remains a challenge. To address this, we introduce necessary adaptations to stereo models, leading to improved performance.
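
As an illustration of the labeling step, the following is a minimal sketch of how a single 3D point could be projected onto an equirectangular image and assigned a depth and a polar-angle disparity for a top-bottom camera pair. It is not the dataset's actual pipeline. The axis convention (x forward, y left, z up), the function names, the image size, and the baseline value are all assumptions made for this example, and the point is assumed to be already expressed in the bottom camera's frame.

```python
import math


def polar_angle(x, y, z):
    """Angle in radians between the point and the +z (up) axis, in [0, pi]."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("point coincides with the camera center")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, z / r)))


def project_equirectangular(point, width, height):
    """Map a 3D point to (u, v, depth) on a width x height equirectangular image.

    Assumed convention: x forward, y left, z up; u spans azimuth, v spans
    the polar angle; depth is the Euclidean distance to the camera center.
    """
    x, y, z = point
    depth = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)      # azimuth in (-pi, pi]
    phi = polar_angle(x, y, z)    # polar angle in [0, pi]
    u = ((theta + math.pi) / (2.0 * math.pi) * width) % width
    v = phi / math.pi * height
    return u, v, depth


def polar_disparity_deg(point, baseline):
    """Disparity in degrees between top and bottom views of the same point.

    Assumes the top camera sits `baseline` meters above the bottom camera
    along z, so the point has coordinates (x, y, z - baseline) in its frame.
    """
    x, y, z = point
    phi_bottom = polar_angle(x, y, z)
    phi_top = polar_angle(x, y, z - baseline)
    return math.degrees(phi_top - phi_bottom)


if __name__ == "__main__":
    # Hypothetical values: a point 1 m straight ahead, a 1920x512 image,
    # and a 0.19 m vertical baseline.
    print(project_equirectangular((1.0, 0.0, 0.0), 1920, 512))
    print(polar_disparity_deg((1.0, 0.0, 0.0), 0.19))
```

Under these assumptions, disparity shrinks as a point moves farther away, which is the relationship a stereo model must invert to recover depth.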