Referring Expression Segmentation On Refcoco 3
Metrics
Overall IoU
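The page names the metric without defining it. In the referring segmentation literature, overall IoU is commonly computed as the total intersection divided by the total union, both accumulated over every test sample, rather than as a mean of per-sample IoUs. The sketch below illustrates that reading under stated assumptions: masks are binary nested lists, the score is scaled to a percentage to match the values listed here, and the `overall_iou` name and `Mask` representation are hypothetical, not taken from this benchmark or from any listed model's evaluation code.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: a binary mask as nested lists of 0/1.
Mask = List[List[int]]


def overall_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Cumulative intersection over cumulative union, as a percentage.

    `pairs` yields (predicted_mask, ground_truth_mask) tuples of equal shape.
    """
    total_inter = 0
    total_union = 0
    for pred, gt in pairs:
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                total_inter += 1 if (p and g) else 0
                total_union += 1 if (p or g) else 0
    if total_union == 0:
        # No foreground pixels anywhere: return 0.0 instead of dividing by zero.
        return 0.0
    return 100.0 * total_inter / total_union


# Toy data: two 2x2 samples.
samples = [
    ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # intersection 1, union 2
    ([[0, 0], [1, 1]], [[0, 1], [1, 1]]),  # intersection 2, union 3
]
score = overall_iou(samples)  # (1 + 2) / (2 + 3) as a percentage
```

Because the sums are pooled before dividing, samples with large target regions weigh more than small ones, which is one reason this figure can differ from a mean of per-sample IoUs.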
Results
Performance results of various models on this benchmark
| Model Name | Overall IoU | Paper Title | Repository |
| --- | --- | --- | --- |
| VLT | 63.53 | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | |
| CPMC | 49.56 | Referring Image Segmentation via Cross-Modal Progressive Comprehension | |
| UniLSeg-20 | 72.70 | Universal Segmentation at Arbitrary Granularity with Language Instruction | |
| MagNet | 66.16 | Mask Grounding for Referring Image Segmentation | |
| BRINet | 48.57 | Bi-Directional Relationship Inferring Network for Referring Image Segmentation | - |
| SafaRi-B | 70.78 | SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | - |
| GLEE-Pro | 69.6 | General Object Foundation Model for Images and Videos at Scale | |
| CMSA | 43.76 | Cross-Modal Self-Attention Network for Referring Image Segmentation | |
| LAVT | 62.14 | LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | |
| MaskRIS (Swin-B, combined DB) | 70.26 | MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | |
| CRIS | 62.27 | CRIS: CLIP-Driven Referring Image Segmentation | |
| UNINEXT-H | 72.47 | Universal Instance Perception as Object Discovery and Retrieval | |
| VLT | 55.50 | Vision-Language Transformer and Query Generation for Referring Segmentation | |
| UniLSeg-100 | 73.18 | Universal Segmentation at Arbitrary Granularity with Language Instruction | |
| PolyFormer-L | 69.33 | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | |
| ReLA | 66.04 | GRES: Generalized Referring Expression Segmentation | |
| HyperSeg | 79.0 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | |
| GROUNDHOG | 70.5 | GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | - |
| DETRIS | 75.2 | Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | |
| MaIL | 62.23 | MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation | - |