# Referring Expression Segmentation on RefCOCO
## Metrics

**Overall IoU** (also called cumulative IoU): the total intersection between predicted and ground-truth masks divided by the total union, accumulated over all test samples. This weights larger objects more heavily than a per-sample mean IoU would.
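The metric above can be sketched as follows; this is a minimal illustration, not any specific benchmark's evaluation code, and the function name and binary-mask conventions are assumptions:

```python
import numpy as np

def overall_iou(pred_masks, gt_masks):
    """Overall (cumulative) IoU: sum all intersections and all unions
    across the dataset first, then divide once at the end."""
    total_inter = 0
    total_union = 0
    for pred, gt in zip(pred_masks, gt_masks):
        pred = pred.astype(bool)
        gt = gt.astype(bool)
        total_inter += np.logical_and(pred, gt).sum()
        total_union += np.logical_or(pred, gt).sum()
    return total_inter / total_union if total_union > 0 else 0.0
```

Note that a single large object with high overlap can dominate the score, which is why Overall IoU and mean IoU can rank the same models differently.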
## Results

Performance of various models on this benchmark, reported as Overall IoU (%).
## Comparison Table

| Model | Overall IoU |
| --- | --- |
| multi-task-visual-grounding-with-coarse-to | 77.96 |
| universal-segmentation-at-arbitrary | 78.29 |
| see-through-text-grouping-for-referring-image | 52.33 |
| gres-generalized-referring-expression-1 | 71.02 |
| maskris-semantic-distortion-aware-data | 74.46 |
| densely-connected-parameter-efficient-tuning | 78.6 |
| cris-clip-driven-referring-image-segmentation | 68.08 |
| mask-grounding-for-referring-image | 71.32 |
| vlt-vision-language-transformer-and-query | 68.43 |
| comprehensive-multi-modal-interactions-for | 58.46 |
| groundhog-grounding-large-language-models-to | 75.0 |
| lavt-language-aware-vision-transformer-for | 68.38 |
| mail-a-unified-mask-image-language-trimodal | 65.92 |
| hyperseg-towards-universal-visual | 83.5 |
| safari-adaptive-sequence-transformer-for | 74.53 |
| mattnet-modular-attention-network-for | 52.39 |
| referring-image-segmentation-via-cross-modal-1 | 53.44 |
| polyformer-referring-image-segmentation-as | 72.89 |
| improving-referring-image-segmentation-using | - |
| cross-modal-self-attention-network-for | 47.60 |
| bi-directional-relationship-inferring-network | 52.87 |
| polyformer-referring-image-segmentation-as | 74.56 |
| universal-instance-perception-as-object | 76.42 |
| refvos-a-closer-look-at-referring-expressions | 49.73 |
| vision-language-transformer-and-query | 59.20 |
| evf-sam-early-vision-language-fusion-for-text | 78.3 |
| multi-label-cluster-discrimination-for-visual | 82.9 |
| maskris-semantic-distortion-aware-data | 75.15 |
| universal-segmentation-at-arbitrary | 77.02 |