Open Vocabulary Semantic Segmentation On 2
Metrics
mIoU
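As a reminder of what the metric measures, mean Intersection over Union (mIoU) averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels. The following is a minimal sketch, not the evaluation code used for this benchmark: the function name `mean_iou`, the `ignore_index` value of 255, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions. Benchmark tooling usually accumulates the counts over the whole dataset before dividing, and the scores listed on this page appear to be percentages (the fraction multiplied by 100).

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; datasets differ
) -> float:
    """Mean IoU over flattened label maps, as a fraction in [0, 1]."""
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps (assumption)
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six pixels, three classes, last pixel ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18.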
Results
Performance results of various models on this benchmark
| Model name | mIoU | Paper title |
|---|---|---|
| Mask-Adapter | 38.2 | Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation |
| MaskCLIP++ | 38.2 | High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation |
| CAT-Seg | 37.9 | CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation |
| SILC | 37.7 | SILC: Improving Vision Language Pretraining with Self-Distillation |
| MAFT+ | 36.1 | Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation |
| OVSeg + OpenDAS | 35.8 | OpenDAS: Open-Vocabulary Domain Adaptation for 2D and 3D Segmentation |
| SED | 35.2 | SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation |
| CLIPSelf | 34.5 | CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction |
| FC-CLIP | 34.1 | Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP |
| SCAN | 33.5 | Open-Vocabulary Segmentation with Semantic-Assisted Calibration |
| EBSeg-L | 32.8 | Open-Vocabulary Semantic Segmentation with Image Embedding Balancing |
| MAFT-ViTL | 32.0 | Learning Mask-aware CLIP Representations for Zero-Shot Segmentation |
| PACL | 31.4 | Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning |
| ODISE | 29.9 | Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models |
| OVSeg Swin-B | 29.6 | Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP |
| MaskCLIP | 23.7 | Open-Vocabulary Universal Image Segmentation with MaskCLIP |
| POMP | 20.7 | - |
| SimSeg | 20.5 | A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model |
| TTD (TCL) | 17.0 | TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias |
| LaVG | 15.8 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation |