Referring Video Object Segmentation on MeViS
Evaluation Metrics
F
J
J&F
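The three metrics are related: assuming the usual convention for video object segmentation benchmarks, J is region similarity (mask IoU), F is contour accuracy (boundary F-measure), and J&F is the arithmetic mean of the two. The sketch below is only an illustration of that relationship. The helper name `j_and_f` and the tolerance are hypothetical, and the sample values are copied from the comparison table on this page.

```python
def j_and_f(j: float, f: float) -> float:
    """Return the mean of region similarity J and contour accuracy F.

    Assumes J&F is defined as (J + F) / 2; this page does not state it.
    """
    return (j + f) / 2.0


# (F, J, reported J&F) for a few entries of the comparison table.
rows = {
    "mpg-sam-2-adapting-sam-2-with-mask-priors-and": (56.7, 50.7, 53.7),
    "the-devil-is-in-temporal-token-high-quality": (53.7, 48.0, 50.9),
    "language-as-queries-for-referring-video": (32.2, 29.8, 31.0),
}

# Scores are reported to one decimal place, so allow for rounding
# (hypothetical tolerance, slightly above half a reporting step).
TOLERANCE = 0.051

for name, (f, j, reported) in rows.items():
    computed = j_and_f(j, f)
    consistent = abs(computed - reported) <= TOLERANCE
    print(f"{name}: computed={computed:.2f} reported={reported} consistent={consistent}")
```

A check like this can flag transcription errors in a leaderboard row. An entry that reports only J&F, with F and J left as `-`, cannot be verified this way.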
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | F | J | J&F |
---|---|---|---|
multi-context-temporal-consistent-modeling | 51.1 | 44.1 | 47.6 |
towards-temporally-consistent-referring-video | 45.5 | 39.9 | 42.7 |
the-devil-is-in-temporal-token-high-quality | 53.7 | 48 | 50.9 |
internvideo2-5-empowering-video-mllms-with | - | - | 32 |
mpg-sam-2-adapting-sam-2-with-mask-priors-and | 56.7 | 50.7 | 53.7 |
language-as-queries-for-referring-video | 32.2 | 29.8 | 31.0 |
decoupling-static-and-hierarchical-motion | 49.8 | 43 | 46.4 |
samwise-infusing-wisdom-in-sam2-for-text | 51.2 | 45.4 | 48.3 |
language-bridged-spatial-temporal-interaction-1 | 30.8 | 27.8 | 29.3 |
vlt-vision-language-transformer-and-query | 37.3 | 33.6 | 35.5 |
mevis-a-large-scale-benchmark-for-video | 40.2 | 34.2 | 37.2 |
end-to-end-referring-video-object | 31.2 | 28.8 | 30.0 |
urvos-unified-referring-video-object | 29.9 | 25.7 | 27.8 |
referdino-referring-video-object-segmentation | 53.9 | 44.7 | 49.3 |
find-first-track-next-decoupling | 50.7 | 45.6 | 48.2 |