
GOOD: Exploring Geometric Cues for Detecting Objects in an Open World

Huang, Haiwen ; Geiger, Andreas ; Zhang, Dan
Abstract

We address the task of open-world class-agnostic object detection, i.e., detecting every object in an image by learning from a limited number of base object classes. State-of-the-art RGB-based models suffer from overfitting the training classes and often fail at detecting novel-looking objects. This is because RGB-based models primarily rely on appearance similarity to detect novel objects and are also prone to overfitting short-cut cues such as textures and discriminative parts. To address these shortcomings of RGB-based object detectors, we propose incorporating geometric cues such as depth and normals, predicted by general-purpose monocular estimators. Specifically, we use the geometric cues to train an object proposal network for pseudo-labeling unannotated novel objects in the training set. Our resulting Geometry-guided Open-world Object Detector (GOOD) significantly improves detection recall for novel object categories and already performs well with only a few training classes. Using a single "person" class for training on the COCO dataset, GOOD surpasses SOTA methods by 5.0% AR@100, a relative improvement of 24%.
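
The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how proposals are selected, so the rule used here (keep the highest-scoring proposals that barely overlap any annotated base-class box) is an assumption, as are the function names `iou` and `select_pseudo_labels` and the parameters `top_k` and `max_iou`. The proposals and scores are assumed to come from a proposal network trained on depth or normal maps.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical convention


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(
    proposals: List[Box],
    scores: List[float],
    annotated: List[Box],
    top_k: int = 3,       # assumed budget of pseudo boxes per image
    max_iou: float = 0.3  # assumed overlap limit with annotated boxes
) -> List[Box]:
    """Keep up to top_k high-scoring proposals that do not overlap annotations."""
    ranked = sorted(zip(scores, proposals), key=lambda pair: pair[0], reverse=True)
    kept: List[Box] = []
    for _, box in ranked:
        if len(kept) >= top_k:
            break
        # A proposal that overlaps an annotated base-class box is treated as
        # already labeled and is skipped (an assumption of this sketch).
        if all(iou(box, gt) < max_iou for gt in annotated):
            kept.append(box)
    return kept


# Toy inputs: one annotated "person" box and four geometry-based proposals.
annotated_boxes = [(0.0, 0.0, 10.0, 10.0)]
proposal_boxes = [
    (1.0, 1.0, 10.0, 10.0),
    (20.0, 20.0, 30.0, 30.0),
    (40.0, 40.0, 50.0, 50.0),
    (21.0, 21.0, 30.0, 30.0),
]
proposal_scores = [0.9, 0.8, 0.3, 0.7]
pseudo_boxes = select_pseudo_labels(
    proposal_boxes, proposal_scores, annotated_boxes, top_k=2
)
```

In this toy case the top-scoring proposal is dropped because it overlaps the annotated box, and the next two by score are kept as pseudo labels. The sketch applies no non-maximum suppression among the proposals themselves, so near-duplicate boxes can both be kept.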