COSMo: CLIP Talks on Open-Set Multi-Target Domain Adaptation

Multi-Target Domain Adaptation (MTDA) entails learning domain-invariant information from a single source domain and applying it to multiple unlabeled target domains. Yet, existing MTDA methods predominantly focus on addressing domain shifts within visual features, often overlooking semantic features and struggling to handle unknown classes, resulting in what is known as Open-Set (OS) MTDA. While large-scale vision-language foundation models like CLIP show promise, their potential for MTDA remains largely unexplored. This paper introduces COSMo, a novel method that learns domain-agnostic prompts through source domain-guided prompt learning to tackle the MTDA problem in the prompt space. By leveraging a domain-specific bias network and separate prompts for known and unknown classes, COSMo effectively adapts across domain and class shifts. To the best of our knowledge, COSMo is the first method to address Open-Set Multi-Target DA (OSMTDA), offering a more realistic representation of real-world scenarios and addressing the challenges of both open-set and multi-target DA. COSMo demonstrates an average improvement of $5.1\%$ across three challenging datasets: Mini-DomainNet, Office-31, and Office-Home, compared to other related DA methods adapted to operate within the OSMTDA setting. Code is available at: https://github.com/munish30monga/COSMo
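The high-level idea in the abstract (a shared domain-agnostic prompt, a domain-specific bias added per target domain, and a separate prompt for the unknown class) can be sketched as follows. This is a minimal illustrative sketch with NumPy, not the authors' implementation: all names, dimensions, and the toy lookup-table "bias network" are assumptions made for clarity.

```python
import numpy as np

# Illustrative sketch only: cosine-similarity classification of an image
# feature against text features built from (i) a shared domain-agnostic
# context, (ii) a per-domain bias, and (iii) an extra "unknown" prompt.
# All shapes and names are assumptions, not COSMo's actual architecture.

rng = np.random.default_rng(0)
D = 8  # embedding dimension (illustrative)

def normalize(v):
    """L2-normalize along the last axis (cosine-similarity convention)."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Shared, domain-agnostic context vector (would be learned in practice).
shared_ctx = rng.normal(size=D)

# Stand-in for the domain-specific bias network: here just a lookup table
# mapping a domain id to a small learned offset on the shared context.
domain_bias = {0: rng.normal(scale=0.1, size=D),
               1: rng.normal(scale=0.1, size=D)}

# Text embeddings for the known classes, plus one prompt reserved for
# rejecting samples as "unknown" (the open-set case).
known_class_emb = normalize(rng.normal(size=(3, D)))  # 3 known classes
unknown_prompt = normalize(rng.normal(size=D))

def classify(image_feat, domain_id):
    """Return the argmax class index; the last index means 'unknown'."""
    ctx = shared_ctx + domain_bias[domain_id]          # domain-adapted context
    text_feats = normalize(known_class_emb + ctx)      # known-class prompts
    unk_feat = normalize(unknown_prompt + ctx)         # unknown-class prompt
    logits = np.concatenate([text_feats @ image_feat,
                             [unk_feat @ image_feat]])
    return int(np.argmax(logits))

img = normalize(rng.normal(size=D))
print(classify(img, domain_id=0))  # an index in 0..3, where 3 = unknown
```

In the actual method the image and text features would come from CLIP's frozen encoders and the context/bias parameters would be trained on the labeled source domain; the sketch only shows how the known-class and unknown-class prompts compete at inference time.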