HyperAI
Referring Expression Segmentation on RefCOCOg
Metrics
Overall IoU
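This page does not say how Overall IoU is computed. In the referring segmentation literature it usually means the cumulative intersection over all test samples divided by the cumulative union, which differs from averaging per-sample IoU. The sketch below illustrates that common definition under this assumption; the function name `overall_iou`, the nested-list mask format, and the choice to return 0.0 for an empty union are all hypothetical and not taken from any listed paper.

```python
from typing import Iterable, Sequence, Tuple

# Assumed mask format: a 2D binary mask given as rows of 0/1 values.
Mask = Sequence[Sequence[int]]


def overall_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Cumulative intersection over cumulative union across all samples."""
    total_inter = 0
    total_union = 0
    for pred, gt in pairs:
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                # Count pixels that are foreground in both masks
                total_inter += 1 if (p and g) else 0
                # Count pixels that are foreground in at least one mask
                total_union += 1 if (p or g) else 0
    # Assumed convention: an empty union scores 0.0
    return total_inter / total_union if total_union else 0.0


# Two toy samples as (prediction, ground truth) pairs
sample_a = ([[1, 1], [0, 0]], [[1, 0], [0, 0]])  # intersection 1, union 2
sample_b = ([[1, 1], [1, 1]], [[1, 1], [1, 0]])  # intersection 3, union 4

score = overall_iou([sample_a, sample_b])  # (1 + 3) / (2 + 4)
print(f"Overall IoU: {100 * score:.2f}")
```

Because the sums are pooled before dividing, samples with large objects weigh more than samples with small ones. For the two toy samples, the per-sample mean would be (0.5 + 0.75) / 2 = 0.625, while the pooled value is 4 / 6.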
Results
Performance results of various models on this benchmark
| Model name | Overall IoU | Paper Title | Repository |
| --- | --- | --- | --- |
| MLCD-Seg-7B | 79.9 | Multi-label Cluster Discrimination for Visual Representation Learning | |
| GROUNDHOG | 74.1 | GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | - |
| LAVT | 61.24 | LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | |
| EVF-SAM | 76.8 | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | |
| UniLSeg-20 | 78.41 | Universal Segmentation at Arbitrary Granularity with Language Instruction | |
| DETRIS | 74.6 | Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | |
| UniLSeg-100 | 79.27 | Universal Segmentation at Arbitrary Granularity with Language Instruction | |
| GLEE-Pro | 72.9 | General Object Foundation Model for Images and Videos at Scale | |
| PolyFormer-L | 69.2 | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | |
| MaskRIS (Swin-B) | 65.55 | MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | |
| SHNet | 49.90 | Comprehensive Multi-Modal Interactions for Referring Image Segmentation | |
| X-Decoder (Davit-d5) | 64.6 | Generalized Decoding for Pixel, Image, and Language | - |
| MagNet | 65.36 | Mask Grounding for Referring Image Segmentation | |
| SafaRi-B | 70.48 | SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | - |
| VLT (Darknet53) | 52.99 | Vision-Language Transformer and Query Generation for Referring Segmentation | |
| VLT (Swin-B) | 63.49 | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | |
| PolyFormer-B | 67.76 | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | |
| HyperSeg | 79.4 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | |
| VATEX | - | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | |
| C3VG | 74.43 | Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | |