PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition

Unlike its image-based counterpart, point cloud based retrieval for place recognition has remained an unexplored and unsolved problem. This is largely due to the difficulty of extracting local feature descriptors from a point cloud that can subsequently be encoded into a global descriptor for the retrieval task. In this paper, we propose PointNetVLAD, which leverages the recent success of deep networks to solve point cloud based retrieval for place recognition. Specifically, our PointNetVLAD is a combination/modification of the existing PointNet and NetVLAD, which allows end-to-end training and inference to extract the global descriptor from a given 3D point cloud. Furthermore, we propose the "lazy triplet and quadruplet" loss functions that can achieve more discriminative and generalizable global descriptors to tackle the retrieval task. We create benchmark datasets for point cloud based retrieval for place recognition, and the experimental results on these datasets show the feasibility of our PointNetVLAD. Our code and the link for the benchmark dataset downloads are available on our project website: http://github.com/mikacuy/pointnetvlad/
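
The abstract names the "lazy triplet and quadruplet" loss functions without defining them. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes that "lazy" means taking the maximum hinge term over the negatives rather than their sum, and that the quadruplet variant adds a second hinge term measured from an extra negative sample. The function names, the Euclidean distance, and the margin values (0.5 and 0.2) are all illustrative assumptions; descriptors are plain lists of floats standing in for the global descriptors the network would produce.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lazy_triplet_loss(anchor, positive, negatives, margin=0.5):
    """Hinge loss driven only by the hardest negative.

    Assumption: "lazy" means max over negatives instead of a sum.
    The margin value is illustrative, not taken from the source.
    """
    d_pos = euclidean(anchor, positive)
    hinge_terms = [
        max(0.0, margin + d_pos - euclidean(anchor, neg)) for neg in negatives
    ]
    return max(hinge_terms)


def lazy_quadruplet_loss(anchor, positive, negatives, other_negative,
                         margin=0.5, margin2=0.2):
    """Lazy triplet term plus a second lazy hinge term.

    Assumption: the second term compares the anchor-positive distance with
    distances between an extra negative sample and each negative.
    """
    d_pos = euclidean(anchor, positive)
    second_terms = [
        max(0.0, margin2 + d_pos - euclidean(other_negative, neg))
        for neg in negatives
    ]
    first = lazy_triplet_loss(anchor, positive, negatives, margin)
    return first + max(second_terms)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    positive = [0.3, 0.4]
    negatives = [[0.0, 0.6], [0.8, 0.0]]
    print(lazy_triplet_loss(anchor, positive, negatives))
    print(lazy_quadruplet_loss(anchor, positive, negatives, [0.0, 1.0]))
```

Under this reading, only the negative that violates the margin the most contributes to the loss, so negatives that already lie far from the anchor do not affect the result.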