HyperAI

Open Vocabulary Semantic Segmentation On 5

Metrics

mIoU
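
As a point of reference, mIoU (mean Intersection over Union) averages the per-class ratio of correctly labelled pixels to the union of predicted and ground-truth pixels for that class. The sketch below is a minimal, dependency-free illustration. The function name `mean_iou`, the flat label lists, and the `ignore_index=255` convention are assumptions made for the example. The exact evaluation protocol behind the scores on this page is not stated here, and benchmark code typically accumulates counts over the whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred, target: flat sequences of integer class ids of equal length
    (hypothetical input format chosen for this sketch).
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:  # skip pixels without a valid label
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both masks are not averaged
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {100 * score:.1f}")
```

Leaderboards such as this one usually report the value as a percentage, which is why the example scales the result by 100 before printing.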

Results

Performance results of various models on this benchmark

| Model Name | mIoU | Paper Title | Repository |
|---|---|---|---|
| TCL | 83.2 | Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs | - |
| MaskCLIP++ | 96.8 | High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation | - |
| SCAN | 97.2 | Open-Vocabulary Segmentation with Semantic-Assisted Calibration | - |
| POMP | 89.4 | Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition | - |
| OVSeg Swin-B | 94.5 | Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP | - |
| EBSeg-L | 96.4 | Open-Vocabulary Semantic Segmentation with Image Embedding Balancing | - |
| ODISE | 84.6 | Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | - |
| ZegFormer | - | Decoupling Zero-Shot Semantic Segmentation | - |
| TagAlign (trained with image-text pairs) | 87.9 | TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification | - |
| ZSSeg | - | A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | - |
| MAFT-ViTL | 92.1 | Learning Mask-aware CLIP Representations for Zero-Shot Segmentation | - |
| PACL | 72.3 | Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning | - |
| HyperSeg | 92.1 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | - |
| MAFT+ | 96.5 | Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation | - |
| LaVG | 82.5 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | - |
| SILC | 97.6 | SILC: Improving Vision Language Pretraining with Self-Distillation | - |
| FC-CLIP | 95.4 | Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP | - |
| CAT-Seg | 97.0 | CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation | - |
| MAFT-ViTL | 92.1 | - | - |