Semi-Supervised Video Object Segmentation on 20
Metrics
D16 val (F)
D16 val (G)
D16 val (J)
D17 test (F)
D17 test (G)
D17 test (J)
D17 val (F)
D17 val (G)
D17 val (J)
FPS
Results
Performance results of various models on this benchmark
Comparison table
Model name | D16 val (F) | D16 val (G) | D16 val (J) | D17 test (F) | D17 test (G) | D17 test (J) | D17 val (F) | D17 val (G) | D17 val (J) | FPS |
---|---|---|---|---|---|---|---|---|---|---|
ranet-ranking-attention-network-for-fast | 85.4 | 85.5 | 85.5 | 57.2 | 55.3 | 53.4 | 68.2 | 65.7 | 63.2 | 30.3 |
fast-video-object-segmentation-using-the | 85.7 | 86.6 | 87.6 | - | - | - | 73.5 | 71.4 | 69.3 | 25.0 |
agss-vos-attention-guided-single-shot-video | - | - | - | 59.7 | 57.2 | 54.8 | 69.9 | 67.4 | 64.9 | 10.0 |
video-object-segmentation-using-space-time | 88.1 | 86.5 | 84.8 | - | - | - | 74.0 | 71.6 | 69.2 | 6.25 |
learning-position-and-target-consistency-for | - | - | - | - | - | - | 77.2 | 75.2 | 73.1 | 8.47 |
Model 6 | 86.4 | 86.1 | 85.8 | - | 55.2 | - | 71.6 | 68.5 | 65.3 | 0.92 |
efficient-regional-memory-network-for-video | 82.3 | 81.5 | 80.6 | - | - | - | 77.2 | 75.0 | 72.8 | 11.9 |
tackling-background-distraction-in-video | 86.2 | 86.8 | 87.5 | 72.2 | 69.4 | 66.6 | 82.3 | 80.0 | 77.6 | 50.1 |
learning-what-to-learn-for-video-object | - | - | - | - | - | - | 76.3 | 74.3 | 72.2 | 14.0 |
sstvos-sparse-spatiotemporal-transformers-for | - | - | - | - | - | - | 81.4 | 78.4 | 75.4 | - |
kernelized-memory-network-for-video-object | 88.1 | 87.6 | 87.1 | - | - | - | 77.8 | 76.0 | 74.2 | 8.33 |
spatiotemporal-cnn-for-video-object | 83.8 | 83.8 | 83.8 | - | - | - | 64.6 | 61.7 | 58.7 | 0.26 |
a-transductive-approach-for-video-object | - | - | - | 67.4 | 63.1 | 58.8 | 74.7 | 72.3 | 69.9 | 37.0 |
joint-inductive-and-transductive-learning-for | - | - | - | - | - | - | 81.2 | 78.6 | 76.0 | 4.00 |
spatiotemporal-graph-neural-network-based | 86.0 | 85.7 | 85.4 | 66.5 | 63.1 | 59.7 | 77.9 | 74.7 | 71.5 | - |
video-object-segmentation-with-adaptive | - | - | - | - | - | - | 76.1 | 74.6 | 73.0 | 4.00 |
hierarchical-memory-matching-network-for | 90.6 | 89.4 | 88.2 | - | - | - | 83.1 | 80.4 | 77.7 | 10.0 |
xmem-long-term-video-object-segmentation-with | - | - | - | - | - | - | - | - | - | 29.6 |
associating-objects-with-transformers-for | - | - | - | - | - | - | 82.0 | 79.2 | 76.4 | 40.0 |
collaborative-video-object-segmentation-by | 86.9 | 86.1 | 85.3 | - | - | - | 77.7 | 74.9 | 72.1 | 5.56 |
learning-fast-and-robust-target-models-for | - | 81.7 | - | - | - | - | 71.2 | 68.8 | 66.4 | 21.9 |
feelvos-fast-end-to-end-embedding-learning | 83.1 | 81.7 | 80.3 | 57.5 | 54.4 | 51.2 | 72.3 | 69.1 | 65.9 | 2.22 |
swem-towards-real-time-video-object-1 | 89.0 | 88.1 | 87.3 | - | - | - | 79.8 | 77.2 | 74.5 | 36.0 |
dmm-net-differentiable-mask-matching-network | - | - | - | - | - | - | 73.3 | 70.7 | 68.1 | - |
pixel-level-bijective-matching-for-video | 81.4 | 82.2 | 82.9 | 64.7 | 62.7 | 60.7 | 74.7 | 72.7 | 70.7 | 45.9 |
fast-video-object-segmentation-via-dynamic | 83.5 | 83.6 | 83.7 | - | - | - | 70.6 | 67.4 | 64.2 | 14.3 |
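
The table does not define its columns. Assuming the DAVIS convention, where J is region similarity, F is boundary accuracy, and G is their mean (often written J&F), the reported G values can be sanity-checked against J and F. The sketch below does this for a few D17 val rows copied from the table. The helper names `global_score` and `inconsistent_rows` are hypothetical, and the 0.1 tolerance is an assumption chosen to absorb one-decimal rounding.

```python
from typing import Dict, List, Tuple

# (F, G, J) on D17 val, copied from the comparison table.
D17_VAL: Dict[str, Tuple[float, float, float]] = {
    "ranet-ranking-attention-network-for-fast": (68.2, 65.7, 63.2),
    "tackling-background-distraction-in-video": (82.3, 80.0, 77.6),
    "hierarchical-memory-matching-network-for": (83.1, 80.4, 77.7),
}


def global_score(j: float, f: float) -> float:
    """Mean of J and F (assumed definition of the G column)."""
    return (j + f) / 2.0


def inconsistent_rows(
    rows: Dict[str, Tuple[float, float, float]], tol: float = 0.1
) -> List[str]:
    """Return the names of rows whose reported G differs from mean(J, F) by more than tol."""
    bad = []
    for name, (f, g, j) in rows.items():
        if abs(global_score(j, f) - g) > tol:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # An empty list means every listed row agrees with G = (J + F) / 2 within tol.
    print(inconsistent_rows(D17_VAL))
```

Rows that report only some of the three values (marked `-` in the table) cannot be checked this way and would need to be skipped.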