Learning Spatial Similarity Distribution for Few-shot Object Counting

Few-shot object counting aims to count the number of objects in a query image that belong to the same class as the given exemplar images. Existing methods compute the similarity between the query image and exemplars in the 2D spatial domain and perform regression to obtain the count. However, these methods overlook the rich information about the spatial distribution of similarity on the exemplar images, which significantly harms matching accuracy. To address this issue, we propose a network that learns the Spatial Similarity Distribution (SSD) for few-shot object counting. It preserves the spatial structure of exemplar features and calculates a 4D similarity pyramid point-to-point between the query features and exemplar features, capturing the complete distribution information for each point in the 4D similarity space. We propose a Similarity Learning Module (SLM) which applies efficient center-pivot 4D convolutions on the similarity pyramid to map different similarity distributions to distinct predicted density values, thereby obtaining accurate counts. Furthermore, we introduce a Feature Cross Enhancement (FCE) module that enhances query and exemplar features mutually to improve the accuracy of feature matching. Our approach outperforms state-of-the-art methods on multiple datasets, including FSC-147 and CARPK. Code is available at https://github.com/CBalance/SSD.
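
To make the contrast between 2D and 4D similarity concrete, the following is a minimal NumPy sketch, not the released implementation. It assumes cosine similarity as the matching measure and a single feature level. The function names `similarity_2d` and `similarity_4d`, the `(C, H, W)` tensor layout, and the average pooling in the 2D baseline are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def _l2_normalize(feat, eps=1e-8):
    # Normalize each spatial location's channel vector to unit length.
    return feat / (np.linalg.norm(feat, axis=0, keepdims=True) + eps)


def similarity_2d(query_feat, exemplar_feat):
    """Illustrative 2D baseline (assumed form): pool the exemplar into one
    vector, then compare it with every query location.

    query_feat: (C, Hq, Wq), exemplar_feat: (C, He, We) -> (Hq, Wq)
    """
    q = _l2_normalize(query_feat)
    pooled = exemplar_feat.mean(axis=(1, 2))          # spatial structure is lost
    pooled = pooled / (np.linalg.norm(pooled) + 1e-8)
    return np.einsum("chw,c->hw", q, pooled)


def similarity_4d(query_feat, exemplar_feat):
    """Point-to-point similarity that keeps the exemplar's spatial layout.

    query_feat: (C, Hq, Wq), exemplar_feat: (C, He, We) -> (Hq, Wq, He, We)
    Entry [h, w, i, j] is the cosine similarity between query location
    (h, w) and exemplar location (i, j).
    """
    q = _l2_normalize(query_feat)
    e = _l2_normalize(exemplar_feat)
    return np.einsum("chw,cij->hwij", q, e)
```

Under these assumptions, each query location is described by a full `(He, We)` similarity map instead of a single scalar. A similarity pyramid would repeat the 4D computation on features from several backbone levels, and the 4D convolutions would then operate on the resulting tensors.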