Unsupervised Semantic Segmentation With 7

Metrics

mIoU
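
For readers unfamiliar with the metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from predicted and ground-truth label maps. The function name `mean_iou`, the flattened-list input format, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration. The evaluation protocol behind the scores on this page is not stated here and may differ, for example in how unsupervised cluster IDs are matched to classes or how ignored pixels are handled.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes, over flattened label maps.

    Assumption: classes that appear in neither the prediction nor the
    ground truth are skipped rather than counted as 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(100 * mean_iou(pred, target, num_classes=3), 1))  # prints 50.0
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5. Leaderboards typically report this value multiplied by 100.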

Results

Reported mIoU of each model on this benchmark

Model Name	mIoU	Paper Title	Repository
ProxyCLIP	83.3	ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
MaskCLIP	74.9	Extract Free Dense Labels from CLIP
TCL	83.2	Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs
GroupViT (RedCaps)	79.7	GroupViT: Semantic Segmentation Emerges from Text Supervision
Trident	88.7	Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
TagAlign	87.9	TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification
COSMOS ViT-B/16	77.7	COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
ReCo	57.7	ReCo: Retrieve and Co-segment for Zero-shot Transfer
