Few-shot Object Counting and Detection

We tackle a new task of few-shot object counting and detection. Given a few exemplar bounding boxes of a target object class, we seek to count and detect all objects of the target class. This task shares the same supervision as few-shot object counting but additionally outputs the object bounding boxes along with the total object count. To address this challenging problem, we introduce a novel two-stage training strategy and a novel uncertainty-aware few-shot object detector, Counting-DETR. The former is aimed at generating pseudo ground-truth bounding boxes to train the latter. The latter leverages the pseudo ground-truth provided by the former but takes the necessary steps to account for its imperfection. To validate the performance of our method on the new task, we introduce two new datasets named FSCD-147 and FSCD-LVIS. Both datasets contain images with complex scenes, multiple object classes per image, and a huge variation in object shapes, sizes, and appearance. Our proposed approach outperforms very strong baselines adapted from few-shot object counting and few-shot object detection by a large margin in both counting and detection metrics. The code and models are available at https://github.com/VinAIResearch/Counting-DETR.