Orientation-Guided Contrastive Learning for UAV-View Geo-Localisation

Retrieving relevant multimedia content is one of the main challenges in a world that is increasingly data-driven. With the proliferation of drones, high-quality aerial footage is now available to a wide audience for the first time. Integrating this footage into applications can enable GPS-less geo-localisation or location correction. In this paper, we present an orientation-guided training framework for UAV-view geo-localisation. Through hierarchical localisation, the orientations of the UAV images are estimated relative to the satellite imagery. We propose a lightweight prediction module for these pseudo-labels, which predicts the orientation between the different views based on the contrastively learned embeddings. We experimentally demonstrate that this prediction supports the training and outperforms previous approaches. The extracted pseudo-labels also enable aligned rotation of the satellite image as an augmentation to further strengthen generalisation. During inference, the orientation module is no longer needed, so no additional computation is required. We achieve state-of-the-art results on both the University-1652 and University-160k datasets.
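To make the idea concrete, the following is a minimal sketch of how an orientation prediction head could sit on top of contrastively learned UAV and satellite embeddings, together with a pseudo-label-driven rotation of the satellite view. The module names (OrientationHead, rotate_satellite_to_align), the discretisation into 8 orientation bins, and the embedding dimension are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch only: an orientation head on paired UAV/satellite embeddings and an
# aligned-rotation augmentation driven by the orientation pseudo-label.
# Bin count, names, and dimensions are assumptions for illustration.

import torch
import torch.nn as nn
import torchvision.transforms.functional as TF


class OrientationHead(nn.Module):
    """Lightweight head predicting the relative orientation (as a bin index)
    between a UAV-view embedding and a satellite-view embedding."""

    def __init__(self, embed_dim: int = 512, num_bins: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim, num_bins),
        )

    def forward(self, uav_emb: torch.Tensor, sat_emb: torch.Tensor) -> torch.Tensor:
        # Concatenate the paired embeddings and classify the orientation bin.
        return self.mlp(torch.cat([uav_emb, sat_emb], dim=-1))


def rotate_satellite_to_align(sat_img: torch.Tensor, bin_idx: int,
                              num_bins: int = 8) -> torch.Tensor:
    """Rotate the satellite image by the angle implied by the orientation
    pseudo-label so both views are approximately aligned during training."""
    angle = 360.0 / num_bins * bin_idx
    return TF.rotate(sat_img, angle)


if __name__ == "__main__":
    # Toy usage: random tensors stand in for the contrastive backbone outputs.
    head = OrientationHead(embed_dim=512, num_bins=8)
    uav_emb = torch.randn(4, 512)
    sat_emb = torch.randn(4, 512)
    logits = head(uav_emb, sat_emb)        # (4, 8) orientation-bin logits
    pseudo_bins = logits.argmax(dim=-1)    # pseudo-labels used for alignment

    sat_img = torch.rand(3, 256, 256)      # dummy satellite image
    aligned = rotate_satellite_to_align(sat_img, int(pseudo_bins[0]))
    print(logits.shape, aligned.shape)
```

Because this head operates only on embeddings already produced for the contrastive objective, it adds little training cost, and it can be dropped entirely at inference time, consistent with the abstract's claim that no additional computation is required when retrieving.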