HyperAI

Referring Expression Segmentation on RefCOCOg

Metrics

Overall IoU
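Overall IoU (also called cumulative IoU) pools the intersection and union pixel counts over the entire test set before dividing, so larger objects contribute more than in a per-image mean IoU. A minimal sketch of the computation, assuming binary NumPy masks (the function name and input format are illustrative, not from any benchmark toolkit):

```python
import numpy as np

def overall_iou(pred_masks, gt_masks):
    """Overall (cumulative) IoU: sum intersections and unions over
    all examples first, then divide once at the end."""
    inter, union = 0, 0
    for pred, gt in zip(pred_masks, gt_masks):
        pred, gt = pred.astype(bool), gt.astype(bool)
        inter += np.logical_and(pred, gt).sum()
        union += np.logical_or(pred, gt).sum()
    return 100.0 * inter / union  # reported as a percentage

# Toy example: one imperfect prediction, one perfect prediction.
preds = [np.array([[1, 1], [0, 0]]), np.array([[1, 1], [1, 1]])]
gts   = [np.array([[1, 0], [0, 0]]), np.array([[1, 1], [1, 1]])]
score = overall_iou(preds, gts)  # (1 + 4) / (2 + 4) = 83.33...
```

Note that averaging the two per-image IoUs (50% and 100%) would give 75%, while the cumulative score is ~83.3%; the pooled statistic favors models that segment large regions well.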

Results

Performance results of various models on this benchmark

| Model Name | Overall IoU | Paper Title |
| --- | --- | --- |
| MLCD-Seg-7B | 79.9 | Multi-label Cluster Discrimination for Visual Representation Learning |
| GROUNDHOG | 74.1 | GROUNDHOG: Grounding Large Language Models to Holistic Segmentation |
| LAVT | 61.24 | LAVT: Language-Aware Vision Transformer for Referring Image Segmentation |
| EVF-SAM | 76.8 | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model |
| UniLSeg-20 | 78.41 | Universal Segmentation at Arbitrary Granularity with Language Instruction |
| DETRIS | 74.6 | Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation |
| UniLSeg-100 | 79.27 | Universal Segmentation at Arbitrary Granularity with Language Instruction |
| GLEE-Pro | 72.9 | General Object Foundation Model for Images and Videos at Scale |
| PolyFormer-L | 69.2 | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation |
| MaskRIS (Swin-B) | 65.55 | MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation |
| SHNet | 49.90 | Comprehensive Multi-Modal Interactions for Referring Image Segmentation |
| X-Decoder (Davit-d5) | 64.6 | Generalized Decoding for Pixel, Image, and Language |
| MagNet | 65.36 | Mask Grounding for Referring Image Segmentation |
| SafaRi-B | 70.48 | SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation |
| VLT (Darknet53) | 52.99 | Vision-Language Transformer and Query Generation for Referring Segmentation |
| VLT (Swin-B) | 63.49 | VLT: Vision-Language Transformer and Query Generation for Referring Segmentation |
| PolyFormer-B | 67.76 | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation |
| HyperSeg | 79.4 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model |
| VATEX | — | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding |
| C3VG | 74.43 | Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints |