HyperAI
Open Vocabulary Semantic Segmentation On 2
Metrics
mIoU
Results
Performance results of various models on this benchmark
| Model Name | mIoU | Paper Title |
| --- | --- | --- |
| Mask-Adapter | 38.2 | Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation |
| MaskCLIP++ | 38.2 | High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation |
| CAT-Seg | 37.9 | CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation |
| SILC | 37.7 | SILC: Improving Vision Language Pretraining with Self-Distillation |
| MAFT+ | 36.1 | Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation |
| OVSeg + OpenDAS | 35.8 | OpenDAS: Open-Vocabulary Domain Adaptation for 2D and 3D Segmentation |
| SED | 35.2 | SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation |
| CLIPSelf | 34.5 | CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction |
| FC-CLIP | 34.1 | Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP |
| SCAN | 33.5 | Open-Vocabulary Segmentation with Semantic-Assisted Calibration |
| EBSeg-L | 32.8 | Open-Vocabulary Semantic Segmentation with Image Embedding Balancing |
| MAFT-ViTL | 32.0 | Learning Mask-aware CLIP Representations for Zero-Shot Segmentation |
| PACL | 31.4 | Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning |
| ODISE | 29.9 | Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models |
| OVSeg Swin-B | 29.6 | Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP |
| MaskCLIP | 23.7 | Open-Vocabulary Universal Image Segmentation with MaskCLIP |
| POMP | 20.7 | - |
| SimSeg | 20.5 | A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model |
| TTD (TCL) | 17.0 | TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias |
| LaVG | 15.8 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation |