Referring Expression Segmentation On Refer 1
Evaluation Metrics
F
J
J&F
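
For orientation, in DAVIS-style video object segmentation evaluation J usually denotes region similarity (mask IoU), F denotes contour accuracy, and the combined score is the arithmetic mean of the two. The sketch below assumes that convention applies to this benchmark. The helper name `j_and_f` is hypothetical and is not taken from any evaluation toolkit.

```python
def j_and_f(j: float, f: float) -> float:
    """Combined score, assuming it is the plain mean of J and F."""
    return (j + f) / 2.0


# J and F values taken from the first row of the comparison table
# (end-to-end-referring-video-object).
score = j_and_f(54.00, 56.64)
print(f"{score:.2f}")
```

Reported scores are typically averaged over sequences before rounding, so some rows in the table can differ from the mean of their rounded J and F columns by about 0.1.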
Evaluation Results
Performance results of each model on this benchmark
Comparison Table
Model Name | F | J | J&F |
---|---|---|---|
end-to-end-referring-video-object | 56.64 | 54.00 | 55.32 |
language-as-queries-for-referring-video | 56.6 | 54.8 | 55.6 |
referdino-referring-video-object-segmentation | 71.5 | 67.0 | 69.3 |
multi-level-representation-learning-with | 48.43 | 50.96 | 49.70 |
general-object-foundation-model-for-images | 72.9 | 68.2 | 70.6 |
urvos-unified-referring-video-object | 50.8 | 47.0 | 48.9 |
mpg-sam-2-adapting-sam-2-with-mask-priors-and | 76.1 | 71.7 | 73.9 |
towards-temporally-consistent-referring-video | 68.9 | 65.3 | 67.1 |
villa-video-reasoning-segmentation-with-large | 68.6 | 64.6 | 66.5 |
uniref-segment-every-reference-object-in | 69.0 | 64.8 | 66.9 |
soc-semantic-assisted-object-cluster-for | 60.5 | 57.8 | 59.2 |
universal-instance-perception-as-object | 72.7 | 67.6 | 70.1 |
universal-segmentation-at-arbitrary | 67.0 | 62.8 | 64.9 |
groprompt-efficient-grounded-prompting-and | 66.9 | 64.1 | 65.5 |
the-devil-is-in-temporal-token-high-quality | 73.1 | 69 | 71 |
r-2vos-robust-referring-video-object | 63.1 | 59.6 | 61.3 |
referred-by-multi-modality-a-unified-temporal | 70.4 | 66.4 | 68.4 |
soc-semantic-assisted-object-cluster-for | 69.3 | 65.3 | 67.3±0.5 |
vlt-vision-language-transformer-and-query | 65.6 | 61.9 | 63.8 |
losh-long-short-text-joint-prediction-network | 66.0 | 62.5 | 64.2 |
local-global-context-aware-transformer-for | 51.1 | 48.8 | 50 |
internvideo2-5-empowering-video-mllms-with | - | - | 34.2 |
language-as-queries-for-referring-video | 58.4 | 56.1 | 57.3 |
segment-every-reference-object-in-spatial-and | 69.2 | 65.5 | 67.4 |
epcformer-expression-prompt-collaboration | 67.2 | 62.9 | 65 |
driving-referring-video-object-segmentation | 69.8 | 65.3 | 67.6 |
onlinerefer-a-simple-online-baseline-for | 65.5 | 61.6 | 63.5 |
multi-attention-network-for-compressed-video | 56.51 | 54.75 | 55.63 |
spectrum-guided-multi-granularity-referring | 67.4 | 63.9 | 65.7 |
univs-unified-and-universal-video | 59.5 | 56.8 | 58.0 |
tracking-anything-with-decoupled-video | - | - | 66.0 |
decoupling-static-and-hierarchical-motion | 69.1 | 65 | 67.1 |
deeply-interleaved-two-stream-encoder-for | 50.67 | 48.44 | 49.56 |