AOT-T (all frames) | 84.7 | 83.5 | 80.0 | 75.2 | 80.9 | 5.3 | 41.0 | Associating Objects with Transformers for Video Object Segmentation | |
R50-AOTv2-L (all frames) | 90.2 | 87.3 | 85.1 | 78.9 | 85.4 | 15.1 | - | Scalable Video Object Segmentation with Identification Mechanism | |
R50-AOT-L (all frames) | 89.5 | 88.2 | 84.5 | 79.6 | 85.5 | 14.9 | 6.4 | Associating Objects with Transformers for Video Object Segmentation | |
R50-AOST (L'=2) | 88.5 | 87.2 | 83.5 | 78.8 | 84.5 | 13.9 | - | Scalable Video Object Segmentation with Identification Mechanism | |
OSVOS | 60.5 | 60.7 | 59.8 | 54.2 | 58.8 | - | 0.10 | One-Shot Video Object Segmentation | |
AOT-B (all frames) | 88.5 | 86.5 | 83.6 | 78.0 | 84.1 | 8.3 | 20.5 | Associating Objects with Transformers for Video Object Segmentation | |
R50-AOST (L'=3) | 88.8 | 87.9 | 83.8 | 79.3 | 85.0 | 15.4 | - | Scalable Video Object Segmentation with Identification Mechanism | |