MinkLoc3D-SI: 3D LiDAR place recognition with sparse convolutions, spherical coordinates, and intensity

3D LiDAR place recognition aims to estimate a coarse localization in a previously seen environment based on a single scan from a rotating 3D LiDAR sensor. Existing solutions to this problem include hand-crafted point cloud descriptors (e.g., ScanContext, M2DP, LiDAR IRIS) and deep learning-based solutions (e.g., PointNetVLAD, PCAN, LPDNet, DAGC, MinkLoc3D), which are often evaluated only on accumulated 2D scans from the Oxford RobotCar dataset. We introduce MinkLoc3D-SI, a sparse convolution-based solution that utilizes spherical coordinates of 3D points and processes the intensity of 3D LiDAR measurements, improving the performance when a single 3D LiDAR scan is used. Our method integrates the improvements typical for hand-crafted descriptors (like ScanContext) with the most efficient 3D sparse convolutions (MinkLoc3D). Our experiments show improved results on single scans from 3D LiDARs (USyd Campus dataset) and strong generalization ability (KITTI dataset). Using intensity information on accumulated 2D scans (RobotCar Intensity dataset) improves the performance, even though the spherical representation does not produce a noticeable improvement. As a result, MinkLoc3D-SI is suited for single scans obtained from a 3D LiDAR, making it applicable in autonomous vehicles.
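To illustrate the two input changes named above (spherical coordinates and intensity), the following is a minimal NumPy sketch of how a single scan might be prepared for a sparse convolutional network. It is not the authors' implementation: the function names, the quantization steps in `steps`, and the choice to average intensity within each voxel are all assumptions made for illustration.

```python
import numpy as np


def cartesian_to_spherical(points):
    """Convert an (N, 3) array of x, y, z into (N, 3) r, theta, phi.

    theta is the horizontal (azimuth) angle and phi the vertical
    (elevation) angle, both in degrees.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    theta = np.degrees(np.arctan2(y, x))
    phi = np.degrees(np.arctan2(z, np.sqrt(x ** 2 + y ** 2)))
    return np.stack([r, theta, phi], axis=1)


def quantize_with_intensity(points, intensity, steps=(1.0, 2.0, 2.0)):
    """Quantize points in spherical space and average intensity per voxel.

    `steps` holds hypothetical quantization steps for (r, theta, phi);
    the values are placeholders, not taken from the paper.
    Returns integer voxel coordinates (M, 3) and one feature per voxel (M,).
    """
    spherical = cartesian_to_spherical(points)
    voxels = np.floor(spherical / np.asarray(steps)).astype(np.int64)

    # Group points that fall into the same voxel.
    unique_voxels, inverse = np.unique(voxels, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # keeps the index 1-D across NumPy versions

    # Average the intensity of all points sharing a voxel (an assumption).
    sums = np.bincount(inverse, weights=intensity, minlength=len(unique_voxels))
    counts = np.bincount(inverse, minlength=len(unique_voxels))
    return unique_voxels, sums / counts


if __name__ == "__main__":
    # Three toy points; the first two land in the same spherical voxel.
    pts = np.array([[1.0, 0.0, 0.0], [1.01, 0.0, 0.0], [1.0, 1.0, 0.0]])
    inten = np.array([0.2, 0.4, 0.9])
    coords, feats = quantize_with_intensity(pts, inten)
    print(coords)
    print(feats)
```

The resulting integer coordinates and per-voxel features have the general shape a sparse tensor library expects. Quantizing in (r, theta, phi) instead of (x, y, z) makes voxels grow with distance from the sensor, which is one plausible way to match the decreasing point density of a single rotating-LiDAR scan.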