CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition

We present CrossLoc3D, a novel 3D place recognition method that solves a large-scale point matching problem in a cross-source setting. Cross-source point cloud data corresponds to point sets captured by depth sensors with different accuracies or from different distances and perspectives. We address the challenges of developing 3D place recognition methods that account for the representation gap between points captured by different sources. Our method handles cross-source data by utilizing multi-grained features and selecting convolution kernel sizes that correspond to the most prominent features. Inspired by diffusion models, our method uses a novel iterative refinement process that gradually shifts the embedding spaces from different sources to a single canonical space for better metric learning. In addition, we present CS-Campus3D, the first 3D aerial-ground cross-source dataset, consisting of point cloud data from both aerial and ground LiDAR scans. The point clouds in CS-Campus3D exhibit representation gaps as well as differences in viewpoint, point density, and noise patterns. We show that our CrossLoc3D algorithm achieves an improvement of 4.74%-15.37% in top-1 average recall on our CS-Campus3D benchmark, and performs comparably to state-of-the-art 3D place recognition methods on Oxford RobotCar. The code and CS-Campus3D benchmark will be available at github.com/rayguan97/crossloc3d.
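The diffusion-inspired iterative refinement described above can be illustrated with a minimal sketch: a source-specific embedding is moved step by step toward a shared canonical space, analogous to denoising steps in a diffusion model. All names here, and the use of a fixed linear map as a stand-in for the learned refinement network, are our own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def iterative_refine(embedding, refine_step, num_steps=8):
    """Gradually shift a source-specific embedding toward a canonical
    space over `num_steps` iterations. `refine_step` is a hypothetical
    learned update; here it is a fixed linear map for illustration."""
    e = embedding
    for t in range(num_steps):
        # Blend the current embedding with the refined prediction,
        # increasing the refinement weight at each step, loosely
        # analogous to one denoising step in a diffusion model.
        alpha = (t + 1) / num_steps
        e = (1.0 - alpha) * e + alpha * refine_step(e)
    return e

# Toy usage: a near-identity projection stands in for the learned network.
rng = np.random.default_rng(0)
W = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
aerial_embedding = rng.standard_normal(4)
refined = iterative_refine(aerial_embedding, lambda e: W @ e)
print(refined.shape)  # (4,)
```

After refinement, embeddings from aerial and ground sources would live in one space, so an ordinary metric-learning loss can compare them directly.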