Self-Supervised Action Recognition on UCF101
Metrics
3-fold Accuracy
Frozen
Pre-Training Dataset
Results
Performance of different models on this benchmark
Comparison table
Model name | 3-fold Accuracy (%) | Frozen | Pre-Training Dataset |
---|---|---|---|
self-supervised-spatiotemporal-feature | 62.9 | false | Kinetics400 |
video-representation-learning-by-dense | 60.6 | false | UCF101 |
xkd-cross-modal-knowledge-distillation-with | 93.4 | - | - |
self-supervised-video-representation-learning-7 | 90.5 | false | UCF101 |
spatiotemporal-contrastive-video | 92.2 | false | Kinetics400 |
self-supervised-video-representation-learning | 65.8 | false | Kinetics400 |
audio-visual-instance-discrimination-with | 91.0 | false | Audioset (Audio+Video) |
spatiotemporal-contrastive-video | 93.4 | false | Kinetics600 |
spatiotemporal-contrastive-video | 93.9 | false | Kinetics600 |
generating-videos-with-scene-dynamics | 52.1 | false | UCF101 |
audio-visual-instance-discrimination-with | 87.5 | false | Kinetics400 (Audio+Video) |
rspnet-relative-speed-perception-for | 93.7 | false | Kinetics400 |
self-supervised-video-representation-learning-7 | 72.2 | true | UCF101 |
videomae-masked-autoencoders-are-data-1 | 91.3 | false | no extra data |
audio-visual-instance-discrimination-with | 91.5 | false | Audioset (Audio+Video) |
masked-video-distillation-rethinking-masked | 97.5 | false | Kinetics400 |
self-supervised-audio-visual-representation | 91.5 | false | Kinetics400 |
self-supervised-audio-visual-representation | 88.3 | false | Kinetics-Sound |
self-supervised-video-representation-learning-7 | 88.8 | false | UCF101 |
self-supervised-video-representation-learning-7 | 82.8 | false | UCF101 |
skip-clip-self-supervised-spatiotemporal | 64.4 | false | UCF101 |
temporally-coherent-embeddings-for-self | 68.2 | false | UCF101 |
videomae-v2-scaling-video-masked-autoencoders | 99.6 | - | - |
video-cloze-procedure-for-self-supervised | 66 | false | UCF101 |
temporally-coherent-embeddings-for-self | 68.8 | false | Kinetics400 |
contrastive-multiview-coding | 59.1 | false | UCF101 |
broaden-your-views-for-self-supervised-video | 93.1 | false | - |
a-large-scale-study-on-unsupervised | 96.3 | false | Kinetics400 |
tclr-temporal-contrastive-learning-for-video | 82.4 | false | UCF101 |
self-supervised-video-representation-learning-8 | 85.4 | - | - |
videomae-masked-autoencoders-are-data-1 | 96.1 | false | Kinetics400 |
self-supervised-video-representation-using | 82.3 | false | UCF101 |
self-supervised-spatio-temporal | 58.8 | false | UCF101 |
video-representation-learning-by-dense | 68.2 | false | Kinetics400 |
self-supervised-video-representation-learning-7 | 88.8 | false | UCF101 |
similarity-contrastive-estimation-for-image | 95.3 | false | Kinetics400 |
benchmarking-self-supervised-video | 97.3 | false | Kinetics400 |
xkd-cross-modal-knowledge-distillation-with | 94.1 | - | Kinetics400 |
video-representation-learning-by-dense | 75.7 | false | Kinetics400 |
slic-self-supervised-learning-with-iterative-1 | - | false | UCF101 |
self-supervised-audio-visual-representation | 92.4 | false | AudioSet |
m-3-video-masked-motion-modeling-for-self | 96.5 | false | Kinetics400 |
self-supervised-video-representation-learning-8 | 84.8 | - | - |
learning-and-using-the-arrow-of-time | 55.3 | false | UCF101 |
audio-visual-instance-discrimination-with | 86.9 | false | Kinetics400 (Audio+Video) |
efficient-video-representation-learning-via | 93.4 | false | no extra data |
shuffle-and-learn-unsupervised-learning-using | 50.9 | false | UCF101 |
self-supervised-spatiotemporal-learning-via | 64.9 | false | UCF101 |
temporally-coherent-embeddings-for-self | 71.2 | false | Kinetics400 |
self-supervised-multimodal-versatile-networks | 95.2 | false | Audioset + Howto100M |
self-supervised-video-representation-learning-1 | 60.3 | false | UCF101 |
self-supervised-co-training-for-video | 74.5 | false | - |
self-supervised-video-representation-learning-3 | 74.4 | false | UCF101 |
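
The rows above are easier to compare once they are parsed and ranked. The following is a minimal Python sketch, assuming the rows are available as pipe-delimited text in the same layout as the table. The `Entry` type, the `parse_row` helper, and the choice of sample rows are illustrative assumptions and not part of the benchmark page itself. A `-` cell is treated as a missing value.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One leaderboard row (hypothetical container for this sketch)."""
    model: str
    accuracy: Optional[float]   # 3-fold accuracy, None when reported as "-"
    frozen: Optional[bool]      # None when reported as "-"
    pretraining: Optional[str]  # None when reported as "-"


def parse_row(line: str) -> Entry:
    """Parse a 'model | accuracy | frozen | dataset |' row into an Entry."""
    # Drop the trailing pipe, then split into exactly four cells.
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    model, acc, frozen, data = cells
    return Entry(
        model=model,
        accuracy=None if acc == "-" else float(acc),
        frozen=None if frozen == "-" else frozen == "true",
        pretraining=None if data == "-" else data,
    )


# A few rows copied verbatim from the table above.
SAMPLE_ROWS = """\
videomae-v2-scaling-video-masked-autoencoders | 99.6 | - | - |
masked-video-distillation-rethinking-masked | 97.5 | false | Kinetics400 |
self-supervised-video-representation-learning-7 | 72.2 | true | UCF101 |
slic-self-supervised-learning-with-iterative-1 | - | false | UCF101 |
shuffle-and-learn-unsupervised-learning-using | 50.9 | false | UCF101 |
"""

entries = [parse_row(l) for l in SAMPLE_ROWS.splitlines() if l.strip()]

# Rank by accuracy, skipping rows with no reported score.
ranked = sorted(
    (e for e in entries if e.accuracy is not None),
    key=lambda e: e.accuracy,
    reverse=True,
)

for e in ranked:
    print(f"{e.accuracy:5.1f}  {e.model}")
```

Keeping missing cells as `None` rather than `0` or an empty string means rows without a score are excluded from the ranking instead of being sorted to the bottom as if they had scored zero.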