HyperAI

Semi-Supervised Video Object Segmentation On 20

Metrics

D16 val (F)
D16 val (G)
D16 val (J)
D17 test (F)
D17 test (G)
D17 test (J)
D17 val (F)
D17 val (G)
D17 val (J)
FPS

Results

Performance of the listed models on this benchmark. A dash (-) indicates that no value is reported for that metric.

Comparison Table
| Model Name | D16 val (F) | D16 val (G) | D16 val (J) | D17 test (F) | D17 test (G) | D17 test (J) | D17 val (F) | D17 val (G) | D17 val (J) | FPS |
|---|---|---|---|---|---|---|---|---|---|---|
| ranet-ranking-attention-network-for-fast | 85.4 | 85.5 | 85.5 | 57.2 | 55.3 | 53.4 | 68.2 | 65.7 | 63.2 | 30.3 |
| fast-video-object-segmentation-using-the | 85.7 | 86.6 | 87.6 | - | - | - | 73.5 | 71.4 | 69.3 | 25.0 |
| agss-vos-attention-guided-single-shot-video | - | - | - | 59.7 | 57.2 | 54.8 | 69.9 | 67.4 | 64.9 | 10.0 |
| video-object-segmentation-using-space-time | 88.1 | 86.5 | 84.8 | - | - | - | 74.0 | 71.6 | 69.2 | 6.25 |
| learning-position-and-target-consistency-for | - | - | - | - | - | - | 77.2 | 75.2 | 73.1 | 8.47 |
| Model 6 | 86.4 | 86.1 | 85.8 | - | 55.2 | - | 71.6 | 68.5 | 65.3 | 0.92 |
| efficient-regional-memory-network-for-video | 82.3 | 81.5 | 80.6 | - | - | - | 77.2 | 75.0 | 72.8 | 11.9 |
| tackling-background-distraction-in-video | 86.2 | 86.8 | 87.5 | 72.2 | 69.4 | 66.6 | 82.3 | 80.0 | 77.6 | 50.1 |
| learning-what-to-learn-for-video-object | - | - | - | - | - | - | 76.3 | 74.3 | 72.2 | 14.0 |
| sstvos-sparse-spatiotemporal-transformers-for | - | - | - | - | - | - | 81.4 | 78.4 | 75.4 | - |
| kernelized-memory-network-for-video-object | 88.1 | 87.6 | 87.1 | - | - | - | 77.8 | 76.0 | 74.2 | 8.33 |
| spatiotemporal-cnn-for-video-object | 83.8 | 83.8 | 83.8 | - | - | - | 64.6 | 61.7 | 58.7 | 0.26 |
| a-transductive-approach-for-video-object | - | - | - | 67.4 | 63.1 | 58.8 | 74.7 | 72.3 | 69.9 | 37.0 |
| joint-inductive-and-transductive-learning-for | - | - | - | - | - | - | 81.2 | 78.6 | 76.0 | 4.00 |
| spatiotemporal-graph-neural-network-based | 86.0 | 85.7 | 85.4 | 66.5 | 63.1 | 59.7 | 77.9 | 74.7 | 71.5 | - |
| video-object-segmentation-with-adaptive | - | - | - | - | - | - | 76.1 | 74.6 | 73.0 | 4.00 |
| hierarchical-memory-matching-network-for | 90.6 | 89.4 | 88.2 | - | - | - | 83.1 | 80.4 | 77.7 | 10.0 |
| xmem-long-term-video-object-segmentation-with | - | - | - | - | - | - | - | - | - | 29.6 |
| associating-objects-with-transformers-for | - | - | - | - | - | - | 82.0 | 79.2 | 76.4 | 40.0 |
| collaborative-video-object-segmentation-by | 86.9 | 86.1 | 85.3 | - | - | - | 77.7 | 74.9 | 72.1 | 5.56 |
| learning-fast-and-robust-target-models-for | - | 81.7 | - | - | - | - | 71.2 | 68.8 | 66.4 | 21.9 |
| feelvos-fast-end-to-end-embedding-learning | 83.1 | 81.7 | 80.3 | 57.5 | 54.4 | 51.2 | 72.3 | 69.1 | 65.9 | 2.22 |
| swem-towards-real-time-video-object-1 | 89.0 | 88.1 | 87.3 | - | - | - | 79.8 | 77.2 | 74.5 | 36.0 |
| dmm-net-differentiable-mask-matching-network | - | - | - | - | - | - | 73.3 | 70.7 | 68.1 | - |
| pixel-level-bijective-matching-for-video | 81.4 | 82.2 | 82.9 | 64.7 | 62.7 | 60.7 | 74.7 | 72.7 | 70.7 | 45.9 |
| fast-video-object-segmentation-via-dynamic | 83.5 | 83.6 | 83.7 | - | - | - | 70.6 | 67.4 | 64.2 | 14.3 |
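
The page does not define the F, G, and J columns. In DAVIS-style evaluation, J usually denotes region similarity and F contour accuracy, with the overall score taken as their mean. Assuming the G columns follow that convention (an assumption, not something this page states), the sketch below checks a handful of rows copied from the table: G should match (J + F) / 2 up to the one-decimal rounding of the published values. The helper names `mean_jf` and `max_gap` are illustrative only.

```python
# Rows copied from the comparison table: (model slug, split, F, G, J).
rows = [
    ("ranet-ranking-attention-network-for-fast", "D17 val", 68.2, 65.7, 63.2),
    ("agss-vos-attention-guided-single-shot-video", "D17 val", 69.9, 67.4, 64.9),
    ("hierarchical-memory-matching-network-for", "D16 val", 90.6, 89.4, 88.2),
    ("tackling-background-distraction-in-video", "D17 val", 82.3, 80.0, 77.6),
    ("pixel-level-bijective-matching-for-video", "D17 test", 64.7, 62.7, 60.7),
    ("fast-video-object-segmentation-using-the", "D16 val", 85.7, 86.6, 87.6),
]


def mean_jf(j, f):
    """Mean of J and F (assumed definition of the G column)."""
    return (j + f) / 2


def max_gap(rows):
    """Largest absolute difference between reported G and mean(J, F)."""
    return max(abs(mean_jf(j, f) - g) for _, _, f, g, j in rows)


if __name__ == "__main__":
    for slug, split, f, g, j in rows:
        print(f"{slug} [{split}]: G={g}, mean(J, F)={mean_jf(j, f):.2f}")
    print(f"largest gap: {max_gap(rows):.2f}")
```

A gap of at most 0.1 on these rows is what one-decimal rounding alone would produce, so the sampled rows are consistent with the assumed definition. It does not prove that every entry on the leaderboard was computed this way.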