OpenDAS: Open-Vocabulary Domain Adaptation for 2D and 3D Segmentation

Yilmaz, Gonca ; Peng, Songyou ; Pollefeys, Marc ; Engelmann, Francis ; Blum, Hermann
Abstract

Recently, Vision-Language Models (VLMs) have advanced segmentation techniques by shifting from the traditional segmentation of a closed-set of predefined object classes to open-vocabulary segmentation (OVS), allowing users to segment novel classes and concepts unseen during training of the segmentation model. However, this flexibility comes with a trade-off: fully-supervised closed-set methods still outperform OVS methods on base classes, that is on classes on which they have been explicitly trained. This is due to the lack of pixel-aligned training masks for VLMs (which are trained on image-caption pairs), and the absence of domain-specific knowledge, such as autonomous driving. Therefore, we propose the task of open-vocabulary domain adaptation to infuse domain-specific knowledge into VLMs while preserving their open-vocabulary nature. By doing so, we achieve improved performance in base and novel classes. Existing VLM adaptation methods improve performance on base (training) queries, but fail to fully preserve the open-set capabilities of VLMs on novel queries. To address this shortcoming, we combine parameter-efficient prompt tuning with a triplet-loss-based training strategy that uses auxiliary negative queries. Notably, our approach is the only parameter-efficient method that consistently surpasses the original VLM on novel classes. Our adapted VLMs can seamlessly be integrated into existing OVS pipelines, e.g., improving OVSeg by +6.0% mIoU on ADE20K for open-vocabulary 2D segmentation, and OpenMask3D by +4.1% AP on ScanNet++ Offices for open-vocabulary 3D instance segmentation without other changes. The project page is available at https://open-das.github.io/.
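
To make the triplet-loss idea mentioned in the abstract concrete, the following is a minimal, generic sketch of a hinge-style triplet loss over embedding vectors. It is not the authors' implementation: the function names, the cosine distance, the margin value, and the choice to use the hardest (most similar) negative query are all assumptions made for illustration. The abstract states only that a triplet loss with auxiliary negative queries is used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_loss(anchor, positive, negatives, margin=0.2):
    """Hinge triplet loss on cosine distance (illustrative, hypothetical helper).

    anchor:    embedding of an image segment
    positive:  embedding of the matching text query
    negatives: embeddings of non-matching (auxiliary negative) text queries
    margin:    assumed value; not taken from the paper
    """
    pos_dist = 1.0 - cosine(anchor, positive)
    # Assumption: penalize against the hardest negative, i.e. the closest one.
    neg_dist = min(1.0 - cosine(anchor, n) for n in negatives)
    return max(0.0, pos_dist - neg_dist + margin)


# A well-separated negative yields zero loss; a negative identical to the
# anchor yields a loss equal to the margin.
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
hard = triplet_loss([1.0, 0.0], [1.0, 0.0], [[1.0, 0.0]])
```

In a real training loop the embeddings would come from the VLM's image and text encoders, and only the prompt parameters would receive gradients; that part is omitted here.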
