Referring Expression Segmentation On Refcoco 5
Metrics
Overall IoU
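The page does not define the metric. In the referring segmentation literature, Overall IoU is commonly taken to be the total intersection divided by the total union, accumulated over every sample in the test split and reported as a percentage. The sketch below assumes that definition. The function name `overall_iou`, the list-of-lists mask format, and the toy masks are illustrative and are not taken from any listed paper or repository.

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1 values


def overall_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Cumulative intersection over cumulative union across all samples."""
    total_inter = 0
    total_union = 0
    for pred, target in pairs:
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                p, t = bool(p), bool(t)
                total_inter += p and t  # pixel is in both masks
                total_union += p or t   # pixel is in at least one mask
    # Assumed convention: an empty union scores 0.0.
    return total_inter / total_union if total_union else 0.0


# Two toy 2x2 samples (hypothetical data, not from the benchmark).
samples = [
    ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # intersection 1, union 2
    ([[0, 0], [1, 1]], [[0, 1], [1, 1]]),  # intersection 2, union 3
]
score = overall_iou(samples)
print(f"Overall IoU: {100 * score:.2f}")
```

Because the sums are pooled before dividing, large objects weigh more than small ones. This is how the measure differs from a mean of per-sample IoU values.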
Results
Performance results of various models on this benchmark
| Model Name | Overall IoU | Paper Title | Repository |
| --- | --- | --- | --- |
| MaIL | 56.06 | MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation | - |
| HyperSeg | 75.2 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | |
| UNINEXT-H | 66.22 | Universal Instance Perception as Object Discovery and Retrieval | |
| MaskRIS (Swin-B) | 59.39 | MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | |
| PolyFormer-L | 61.87 | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | |
| MattNet | 40.08 | MAttNet: Modular Attention Network for Referring Expression Comprehension | |
| VLT | 56.92 | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation | |
| CRIS | 53.68 | CRIS: CLIP-Driven Referring Image Segmentation | |
| VATEX | - | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | |
| SafaRi-B | 64.88 | SafaRi:Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | - |
| MagNet | 58.14 | Mask Grounding for Referring Image Segmentation | |
| CMSA | 37.89 | Cross-Modal Self-Attention Network for Referring Image Segmentation | |
| EVF-SAM | 70.1 | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | |
| STEP (5-fold) | 40.41 | See-Through-Text Grouping for Referring Image Segmentation | - |
| PolyFormer-B | 59.33 | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | |
| DETRIS | 70.2 | Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | |
| LAVT | 55.1 | LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | |
| UniLSeg-20 | 66.99 | Universal Segmentation at Arbitrary Granularity with Language Instruction | |
| UniLSeg-100 | 68.15 | Universal Segmentation at Arbitrary Granularity with Language Instruction | |
| MaskRIS (Swin-B, combined DB) | 62.83 | MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | |