Open Vocabulary Semantic Segmentation On 1

Metrics

mIoU
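
Mean intersection over union (mIoU) averages, over classes, the ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch of that definition, not the evaluation code used by this benchmark: the function name `mean_iou`, the `ignore_index` value of 255, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; not specified by the benchmark
) -> float:
    """Compute mIoU over flattened per-pixel class labels."""
    intersection = [0] * num_classes
    union = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the true class and,
            # if valid, of the predicted class.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1

    # Average only over classes that occur in pred or target (assumption).
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU is 1/3, 2/3 and 1/2, so the mean is 0.5.
    print(mean_iou(pred, target, num_classes=3))
```

Leaderboards such as this one usually report the value as a percentage, so a result of 0.5 from the sketch would correspond to an mIoU of 50.0.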

Results

Performance results of each model on this benchmark

| Model | mIoU | Paper Title | Repository |
| --- | --- | --- | --- |
| TCL | 33.9 | Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs | - |
| PACL | 50.1 | Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning | - |
| TagAlign (trained with image-text pairs) | 37.6 | TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification | - |
| Mask-Adapter | 60.4 | Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation | - |
| FC-CLIP | 58.4 | Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | - |
| MAFT+ | 59.4 | Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation | - |
| MAFT-ViTL | 58.5 | Learning Mask-aware CLIP Representations for Zero-Shot Segmentation | - |
| CAT-Seg | 63.3 | CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation | - |
| OVSeg Swin-B | 55.7 | Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP | - |
| SED | 60.6 | SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation | - |
| LaVG | 34.7 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | - |
| HyperSeg | 64.6 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | - |
| TTD (TCL) | 37.4 | TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias | - |
| SimSeg | 47.7 | A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | - |
| SILC | 63.5 | SILC: Improving Vision Language Pretraining with Self-Distillation | - |
| MaskCLIP | 45.9 | Open-Vocabulary Universal Image Segmentation with MaskCLIP | - |
| EBSeg-L | 60.2 | Open-Vocabulary Semantic Segmentation with Image Embedding Balancing | - |
| CLIP Surgery (original CLIP without any fine-tuning) | 29.3 | A Closer Look at the Explainability of Contrastive Language-Image Pre-training | - |
| MaskCLIP++ | 62.5 | High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation | - |
| SCAN | 59.3 | Open-Vocabulary Segmentation with Semantic-Assisted Calibration | - |