HyperAI

Self-Supervised Action Recognition on UCF101

Metrics

3-fold Accuracy
Frozen
Pre-Training Dataset
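
The page does not define how "3-fold Accuracy" is computed. A minimal sketch, assuming it is the mean top-1 accuracy over the three standard UCF101 train/test splits; the function name `three_fold_accuracy` and the per-split numbers are hypothetical and chosen only for illustration:

```python
from statistics import mean


def three_fold_accuracy(split_accuracies):
    """Average per-split accuracy (in percent) over the three UCF101 splits.

    Assumption: "3-fold" means one accuracy value per official split,
    averaged and rounded to one decimal place.
    """
    if len(split_accuracies) != 3:
        raise ValueError("expected one accuracy per UCF101 split (3 values)")
    return round(mean(split_accuracies), 1)


# Hypothetical per-split accuracies, not taken from any model listed here.
example_splits = [90.0, 91.0, 92.0]
print(three_fold_accuracy(example_splits))
```

Some papers report split 1 only, so a leaderboard value may not always follow this convention.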

Results

Performance results of various models on this benchmark

Comparison table
| Model name | 3-fold Accuracy | Frozen | Pre-Training Dataset |
| --- | --- | --- | --- |
| self-supervised-spatiotemporal-feature | 62.9 | false | Kinetics400 |
| video-representation-learning-by-dense | 60.6 | false | UCF101 |
| xkd-cross-modal-knowledge-distillation-with | 93.4 | - | - |
| self-supervised-video-representation-learning-7 | 90.5 | false | UCF101 |
| spatiotemporal-contrastive-video | 92.2 | false | Kinetics400 |
| self-supervised-video-representation-learning | 65.8 | false | Kinetics400 |
| audio-visual-instance-discrimination-with | 91.0 | false | Audioset (Audio+Video) |
| spatiotemporal-contrastive-video | 93.4 | false | Kinetics600 |
| spatiotemporal-contrastive-video | 93.9 | false | Kinetics600 |
| generating-videos-with-scene-dynamics | 52.1 | false | UCF101 |
| audio-visual-instance-discrimination-with | 87.5 | false | Kinetics400 (Audio+Video) |
| rspnet-relative-speed-perception-for | 93.7 | false | Kinetics400 |
| self-supervised-video-representation-learning-7 | 72.2 | true | UCF101 |
| videomae-masked-autoencoders-are-data-1 | 91.3 | false | no extra data |
| audio-visual-instance-discrimination-with | 91.5 | false | Audioset (Audio+Video) |
| masked-video-distillation-rethinking-masked | 97.5 | false | Kinetics400 |
| self-supervised-audio-visual-representation | 91.5 | false | Kinetics400 |
| self-supervised-audio-visual-representation | 88.3 | false | Kinetics-Sound |
| self-supervised-video-representation-learning-7 | 88.8 | false | UCF101 |
| self-supervised-video-representation-learning-7 | 82.8 | false | UCF101 |
| skip-clip-self-supervised-spatiotemporal | 64.4 | false | UCF101 |
| temporally-coherent-embeddings-for-self | 68.2 | false | UCF101 |
| videomae-v2-scaling-video-masked-autoencoders | 99.6 | - | - |
| video-cloze-procedure-for-self-supervised | 66 | false | UCF101 |
| temporally-coherent-embeddings-for-self | 68.8 | false | Kinetics400 |
| contrastive-multiview-coding | 59.1 | false | UCF101 |
| broaden-your-views-for-self-supervised-video | 93.1 | false | - |
| a-large-scale-study-on-unsupervised | 96.3 | false | Kinetics400 |
| tclr-temporal-contrastive-learning-for-video | 82.4 | false | UCF101 |
| self-supervised-video-representation-learning-8 | 85.4 | - | - |
| videomae-masked-autoencoders-are-data-1 | 96.1 | false | Kinetics400 |
| self-supervised-video-representation-using | 82.3 | false | UCF101 |
| self-supervised-spatio-temporal | 58.8 | false | UCF101 |
| video-representation-learning-by-dense | 68.2 | false | Kinetics400 |
| self-supervised-video-representation-learning-7 | 88.8 | false | UCF101 |
| similarity-contrastive-estimation-for-image | 95.3 | false | Kinetics400 |
| benchmarking-self-supervised-video | 97.3 | false | Kinetics400 |
| xkd-cross-modal-knowledge-distillation-with | 94.1 | - | Kinetics400 |
| video-representation-learning-by-dense | 75.7 | false | Kinetics400 |
| slic-self-supervised-learning-with-iterative-1 | - | false | UCF101 |
| self-supervised-audio-visual-representation | 92.4 | false | AudioSet |
| m-3-video-masked-motion-modeling-for-self | 96.5 | false | Kinetics400 |
| self-supervised-video-representation-learning-8 | 84.8 | - | - |
| learning-and-using-the-arrow-of-time | 55.3 | false | UCF101 |
| audio-visual-instance-discrimination-with | 86.9 | false | Kinetics400 (Audio+Video) |
| efficient-video-representation-learning-via | 93.4 | false | no extra data |
| shuffle-and-learn-unsupervised-learning-using | 50.9 | false | UCF101 |
| self-supervised-spatiotemporal-learning-via | 64.9 | false | UCF101 |
| temporally-coherent-embeddings-for-self | 71.2 | false | Kinetics400 |
| self-supervised-multimodal-versatile-networks | 95.2 | false | Audioset + Howto100M |
| self-supervised-video-representation-learning-1 | 60.3 | false | UCF101 |
| self-supervised-co-training-for-video | 74.5 | false | - |
| self-supervised-video-representation-learning-3 | 74.4 | false | UCF101 |