Visual Object Tracking on DAVIS 2017
Metrics
F-measure (Mean)
J&F
Jaccard (Mean)
Params(M)
Speed (FPS)
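As a rough illustration of how the combined metric relates to the other two, the sketch below assumes J&F is the arithmetic mean of Jaccard (Mean) and F-measure (Mean), which is the usual DAVIS convention and agrees with most rows of the comparison table. The function name `j_and_f` is hypothetical and not part of any benchmark toolkit. A few reported rows differ from this average by a few tenths, so the published values should be treated as authoritative.

```python
from typing import Optional


def j_and_f(jaccard_mean: Optional[float], f_mean: Optional[float]) -> Optional[float]:
    """Return the average of Jaccard (Mean) and F-measure (Mean).

    Hypothetical helper: assumes J&F is the plain arithmetic mean of the
    two scores. Returns None when either input is missing, mirroring the
    '-' entries in the table.
    """
    if jaccard_mean is None or f_mean is None:
        return None
    return (jaccard_mean + f_mean) / 2


# Values taken from the first row of the comparison table
# (Jaccard 77.8, F-measure 83.8, reported J&F 80.8).
score = j_and_f(77.8, 83.8)
print(round(score, 2))
```

Returning `None` for incomplete rows avoids silently inventing a combined score for entries that report only one of the two metrics.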
Results
Performance results of various models on this benchmark
Comparison Table
Model Name | F-measure (Mean) | J&F | Jaccard (Mean) | Params(M) | Speed (FPS) |
---|---|---|---|---|---|
decoupling-features-in-hierarchical | 83.8 | 80.8 | 77.8 | 10.2 | 49.2 |
decoupling-features-in-hierarchical | 87.1 | 84.1 | 81.0 | 13.2 | 28.5 |
associating-objects-with-transformers-for | 83.9 | 81.3 | 78.7 | 7.0 | 40.0 |
xmem-long-term-video-object-segmentation-with | 91.0 | 88.2 | 85.4 | - | - |
learning-video-object-segmentation-from-2 | 58.0 | 56.05 | 54.1 | - | - |
associating-objects-with-scalable | 89.5 | 86.7 | 83.8 | 65.6 | 1.3 |
a-transductive-approach-for-video-object | 74.7 | 72.3 | 69.9 | - | - |
reliable-propagation-correction-modulation | 86 | 83.7 | 81.3 | - | - |
associating-objects-with-scalable | 86.1 | 83.7 | 81.2 | 12.5 | 37.4 |
associating-objects-with-transformers-for | 88.4 | 85.4 | 82.4 | 65.4 | 12.1 |
associating-objects-with-scalable | 89.4 | 86.3 | 83.1 | 65.6 | 12.0 |
associating-objects-with-scalable | 89.8 | 87.0 | 84.2 | 65.6 | 1.3 |
joint-inductive-and-transductive-learning-for | 81.2 | 78.6 | 76.0 | - | - |
joint-task-self-supervised-learning-for | 61.3 | 59.5 | 57.7 | - | - |
xmem-long-term-video-object-segmentation-with | 92.6 | 89.5 | 86.3 | - | - |
video-object-segmentation-without-temporal | 71.3 | 68 | 64.7 | - | - |
tracking-anything-with-decoupled-video | 91.0 | 87.6 | 84.2 | - | 25.3 |
spatiotemporal-cnn-for-video-object | 64.6 | 61.65 | 58.7 | - | - |
xmem-long-term-video-object-segmentation-with | 79.3 | 76.7 | 74.1 | - | 22.6 |
self-supervised-video-object-segmentation-by-1 | 71.2 | 69.7 | 68.3 | - | - |
associating-objects-with-transformers-for | 87.5 | 84.9 | 82.3 | 14.9 | 18.0 |
fast-online-object-tracking-and-segmentation | 58.5 | 56.4 | 54.3 | - | - |
190408141 | 78.9 | 76.15 | 73.4 | - | - |
associating-objects-with-transformers-for | 82.3 | 79.9 | 77.4 | 5.7 | 51.4 |
mast-a-memory-augmented-self-supervised | 67.6 | 65.5 | 63.3 | - | - |
collaborative-video-object-segmentation-by | 84.6 | 81.9 | 79.1 | - | - |
fast-and-accurate-online-video-object | 61.8 | 58.2 | 54.6 | - | - |
premvos-proposal-generation-refinement-and | 81.8 | 77.85 | 73.9 | - | - |
swem-towards-real-time-video-object-1 | 79.8 | 77.2 | 74.5 | - | - |
one-shot-video-object-segmentation | 63.9 | 60.25 | 56.6 | - | - |
efficient-video-object-segmentation-via | 57.1 | 54.8 | 52.5 | - | - |
look-before-you-match-instance-understanding | - | 88.6 | 85.8 | - | - |
video-object-segmentation-with-adaptive | 76.1 | 74.6 | 73.0 | - | - |
agss-vos-attention-guided-single-shot-video | 69.8 | 66.6 | 63.4 | - | - |
associating-objects-with-scalable | 88.0 | 85.3 | 82.5 | 13.9 | 24.3 |
separable-structure-modeling-for-semi | 79.9 | 77.6 | 75.3 | - | 22.3 |
feelvos-fast-end-to-end-embedding-learning | 74.0 | 71.55 | 69.1 | - | - |
tarvis-a-unified-approach-for-target-based | 88.5 | 85.3 | 81.7 | - | - |
putting-the-object-back-into-video-object | 90.8 | 88.1 | 85.5 | - | 17.9 |
mobilevos-real-time-video-object-segmentation | 88.9 | 82.3 | - | 8.1 | 90.6 |
putting-the-object-back-into-video-object | 91.1 | 87.9 | 84.6 | 36.4 | - |
a-generative-appearance-model-for-end-to-end | 73.6 | 71.05 | 68.5 | - | - |
xmem-long-term-video-object-segmentation-with | 89.5 | 86.2 | 82.9 | - | 22.6 |
videomatch-matching-based-video-object | 68.2 | 62.4 | 56.5 | - | - |
look-before-you-match-instance-understanding | 93.0 | 89.8 | 86.7 | - | - |
decoupling-features-in-hierarchical | 89.2 | 86.2 | 83.1 | 70.3 | 15.4 |
hierarchical-memory-matching-network-for | 87.5 | 84.7 | 81.9 | - | - |
efficient-regional-memory-network-for-video | 86.0 | 83.5 | 81.0 | - | - |
xmem-long-term-video-object-segmentation-with | 91.4 | 87.7 | 84.0 | - | 22.6 |
online-adaptation-of-convolutional-neural | 69.1 | 65.35 | 61.6 | - | - |
decoupling-features-in-hierarchical | 83.3 | 80.5 | 77.7 | 7.2 | 63.5 |
cnn-in-mrf-video-object-segmentation-via | 74.0 | 70.6 | 67.2 | - | - |
learning-quality-aware-dynamic-memory-for | 88.6 | 85.6 | 82.5 | - | - |
lsmvos-long-short-term-similarity-matching | 80.8 | 77.4 | 73.9 | - | - |
memory-matching-is-not-enough-jointly | 91.0 | 88.1 | 85.2 | - | - |
putting-the-object-back-into-video-object | 93.4 | 90.5 | 87.5 | 17.9 | - |
rvos-end-to-end-recurrent-network-for-video | 63.6 | 60.55 | 57.5 | - | - |
self-supervised-learning-for-video | 52.2 | 50.3 | 48.4 | - | - |
learning-correspondence-from-the-cycle | 50.0 | 48.7 | 46.4 | - | - |
2408-00714 | - | 90.7 | - | 224.4 | - |
modular-interactive-video-object-segmentation | 87.4 | 84.5 | 81.7 | - | 11.2 |
associating-objects-with-transformers-for | 86.4 | 83.8 | 81.1 | 8.3 | 18.7 |
decoupling-features-in-hierarchical | 88.2 | 85.2 | 82.2 | 19.8 | 27.0 |
mobilevos-real-time-video-object-segmentation | 87.1 | 80.2 | - | 8.1 | 90.6 |
region-aware-video-object-segmentation-with | 89.3 | 86.1 | 82.9 | - | 42 (on 3090) |
make-one-shot-video-object-segmentation-1 | 80.0 | 77.2 | 74.4 | - | - |
fast-video-object-segmentation-by-reference | 68.6 | 66.7 | 64.8 | - | - |
associating-objects-with-scalable | 88.5 | 85.6 | 82.6 | 15.4 | 17.5 |
kernelized-memory-network-for-video-object | 85.6 | 82.8 | 80 | - | - |
video-object-segmentation-with-language | 63.5 | - | - | - | - |
rethinking-space-time-networks-with-improved | 88.6 | 85.3 | 82.0 | - | 20.2 |
proposal-tracking-and-segmentation-pts-a | 77.7 | 74.65 | 71.6 | - | - |
associating-objects-with-transformers-for | 85.2 | 82.5 | 79.7 | 8.3 | 29.6 |
dense-unsupervised-learning-for-video | 71.7 | 69.4 | 67.1 | - | - |
xmem-long-term-video-object-segmentation-with | 87.6 | 84.5 | 81.4 | - | 22.6 |
collaborative-video-object-segmentation-by-1 | 85.7 | 82.9 | 80.1 | - | - |
video-object-segmentation-with-language | - | 60.8 | 58.0 | - | - |
video-object-segmentation-using-space-time | 84.3 | 81.75 | 79.2 | - | - |
look-before-you-match-instance-understanding | 91.9 | 88.2 | 84.5 | - | - |
siam-r-cnn-visual-tracking-by-re-detection | 75.0 | 70.55 | 66.1 | - | - |
decoupling-features-in-hierarchical | 85.1 | 82.2 | 79.2 | 13.2 | 40.9 |
ranet-ranking-attention-network-for-fast | 68.2 | 65.7 | 63.2 | - | - |