Learning To Count Everything

Existing works on visual counting primarily focus on one specific category at a time, such as people, animals, or cells. In this paper, we are interested in counting everything, that is, counting objects from any category given only a few annotated instances from that category. To this end, we pose counting as a few-shot regression task. To tackle this task, we present a novel method that takes a query image together with a few exemplar objects from the query image and predicts a density map for the presence of all objects of interest in the query image. We also present a novel adaptation strategy to adapt our network to any novel visual category at test time, using only a few exemplar objects from the novel category. Furthermore, we introduce a dataset of 147 object categories containing over 6000 images suitable for the few-shot counting task. The images are annotated with two types of annotation, dots and bounding boxes, and they can be used to develop few-shot counting models. Experiments on this dataset show that our method outperforms several state-of-the-art object detectors and few-shot counting approaches. Our code and dataset can be found at https://github.com/cvlab-stonybrook/LearningToCountEverything.
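To make the density-map formulation concrete, below is a minimal sketch of few-shot counting as the abstract describes it: features are extracted from the query image, correlated with features pooled from the exemplar boxes, and a small head regresses a density map whose sum gives the count. This is an illustrative assumption, not the authors' actual architecture; all names here (FewShotCounter, regressor, the 3x3 kernel size, the ResNet-50 backbone) are hypothetical.

```python
# Hypothetical sketch of density-map-based few-shot counting (not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class FewShotCounter(nn.Module):
    def __init__(self):
        super().__init__()
        # Frozen ImageNet backbone for query-image features (an assumption).
        resnet = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        self.backbone = nn.Sequential(*list(resnet.children())[:-4])  # conv1..layer2
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Small regression head mapping a similarity map to a density map.
        self.regressor = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1), nn.ReLU(),  # densities are non-negative
        )

    def forward(self, image, exemplar_boxes):
        # image: (1, 3, H, W); exemplar_boxes: list of (x1, y1, x2, y2) in pixels.
        feat = self.backbone(image)                      # (1, C, h, w)
        scale = feat.shape[-1] / image.shape[-1]
        corr_maps = []
        for (x1, y1, x2, y2) in exemplar_boxes:
            # Crop the exemplar's feature patch and resize it to a fixed kernel.
            fx1, fy1 = int(x1 * scale), int(y1 * scale)
            fx2 = max(fx1 + 1, int(x2 * scale))
            fy2 = max(fy1 + 1, int(y2 * scale))
            patch = feat[:, :, fy1:fy2, fx1:fx2]
            kernel = F.interpolate(patch, size=(3, 3), mode="bilinear",
                                   align_corners=False)  # (1, C, 3, 3)
            # Correlate exemplar features with query-image features.
            corr = F.conv2d(feat, kernel, padding=1)     # (1, 1, h, w)
            corr_maps.append(corr)
        # Average similarity over exemplars, then regress the density map.
        similarity = torch.stack(corr_maps).mean(0)
        return self.regressor(similarity)                # (1, 1, h, w)

# Usage: the predicted count is the sum of the predicted density map.
model = FewShotCounter().eval()
image = torch.rand(1, 3, 384, 384)
boxes = [(50, 60, 90, 100), (200, 210, 240, 250), (120, 40, 160, 80)]
with torch.no_grad():
    density = model(image, boxes)
print(f"predicted count: {density.sum().item():.1f}")
```

In this sketch, the test-time adaptation mentioned in the abstract would correspond to briefly fine-tuning the regression head on the novel category's exemplars, while the count is always obtained by summing the density map.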