Referring Expression Segmentation on RefCOCO
Metrics
Overall IoU
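Overall IoU is commonly defined as the total intersection area divided by the total union area, accumulated over every sample in the evaluation set, rather than the average of per-sample IoU scores. The sketch below illustrates that definition under a few assumptions: masks are binary nested lists, the score is reported in percent, and the function name `overall_iou` is hypothetical. It is not the evaluation script behind the numbers listed on this page, and individual entries may follow slightly different protocols.

```python
from typing import Sequence

# 2D binary mask: 1 = pixel belongs to the referred object, 0 = background
Mask = Sequence[Sequence[int]]


def overall_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Cumulative intersection over cumulative union across all samples, in percent."""
    total_inter = 0
    total_union = 0
    for pred, target in zip(preds, targets):
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                total_inter += 1 if (p and t) else 0
                total_union += 1 if (p or t) else 0
    if total_union == 0:
        # Assumption: report 0 when every predicted and target mask is empty.
        return 0.0
    return 100.0 * total_inter / total_union


# Two toy samples (hypothetical data, not taken from the benchmark).
preds = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
]
targets = [
    [[1, 0], [0, 0]],
    [[1, 1], [1, 1]],
]
score = overall_iou(preds, targets)
print(round(score, 2))  # 33.33
```

In this toy case the per-sample IoUs are 50% and 25%, so their mean would be 37.5%, while the overall IoU is 2/6, about 33.33%. Because the metric pools pixels across samples, larger objects weigh more heavily in the final score.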
Results
Performance results of various models on this benchmark
Comparison table
Model name | Overall IoU |
---|---|
maskris-semantic-distortion-aware-data | 78.71 |
bi-directional-relationship-inferring-network | 61.35 |
vision-language-transformer-and-query | 65.65 |
comprehensive-multi-modal-interactions-for | 65.32 |
cris-clip-driven-referring-image-segmentation | 70.47 |
safari-adaptive-sequence-transformer-for | 77.21 |
referring-image-segmentation-via-cross-modal-1 | 61.36 |
referring-expression-object-segmentation-with | 58.90 |
maskris-semantic-distortion-aware-data | 76.49 |
evf-sam-early-vision-language-fusion-for-text | 82.1 |
cross-modal-self-attention-network-for | 58.32 |
multi-label-cluster-discrimination-for-visual | 83.6 |
hierarchical-open-vocabulary-universal-image-1 | 82.8 |
psalm-pixelwise-segmentation-with-large-multi | 83.6 |
polyformer-referring-image-segmentation-as | 75.96 |
universal-segmentation-at-arbitrary | 81.74 |
groundhog-grounding-large-language-models-to | 78.5 |
densely-connected-parameter-efficient-tuning | 81.0 |
refvos-a-closer-look-at-referring-expressions | 59.45 |
refvos-a-closer-look-at-referring-expressions | 58.65 |
vlt-vision-language-transformer-and-query | 72.96 |
mask-grounding-for-referring-image | 75.24 |
mail-a-unified-mask-image-language-trimodal | 70.13 |
referring-transformer-a-one-step-approach-to | 70.56 |
multi-task-visual-grounding-with-coarse-to | 80.89 |
bridging-vision-and-language-encoders | 71.06 |
improving-referring-image-segmentation-using | - |
polyformer-referring-image-segmentation-as | 74.82 |
unleashing-text-to-image-diffusion-models-for-1 | 73.25 |
general-object-foundation-model-for-images | 80.0 |
gres-generalized-referring-expression-1 | 73.82 |
see-through-text-grouping-for-referring-image | 56.58 |
hyperseg-towards-universal-visual | 84.8 |
mattnet-modular-attention-network-for | 56.51 |
universal-instance-perception-as-object | 82.19 |