HyperAI

Self-Supervised Action Recognition on HMDB51

Evaluation Metrics

Frozen
Pre-Training Dataset
Top-1 Accuracy
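
For readers unfamiliar with the metric, Top-1 Accuracy is the percentage of test clips whose highest-scoring predicted class matches the ground-truth label. The following is a minimal sketch of that computation, not the evaluation code used by any listed paper. The function name `top1_accuracy` and the toy scores are illustrative assumptions and are not benchmark data.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of clips whose highest-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 3 classes and 4 clips (made-up numbers, not benchmark data).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 1, 1]
print(top1_accuracy(toy_scores, toy_labels))  # 75.0
```

In the toy example, three of the four clips are classified correctly, so the result is 75.0. On HMDB51 the same calculation would run over 51 action classes.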

Evaluation Results

Performance results of each model on this benchmark

| Model Name | Frozen | Pre-Training Dataset | Top-1 Accuracy | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- |
| RSPNet | false | Kinetics400 | 64.7 | RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning | |
| VideoMAE | false | Kinetics400 | 73.3 | VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | - |
| PCL (ResNet-18) | false | UCF101 | 43.2 | Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning | |
| XDC | false | IG-Random | 66.5 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| Motion & Appearance (C3D) | false | UCF101 | 20.3 | Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics | |
| XDC | false | Kinetics400 | 52.6 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| Video Clip Ordering (R3D) | false | UCF101 | 29.5 | Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction | - |
| XDC | false | AudioSet | 63.7 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| VideoMS (ViT-B) | false | no extra data | 65.8 | EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens | |
| ViCC (R2+1D; RGB) | false | UCF101 | 52.4 | Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | |
| CoCLR | false | - | 46.1 | Self-supervised Co-training for Video Representation Learning | - |
| SLIC (R3D-18) | false | UCF101 | 54.5 | SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos | |
| 3D Cubic Puzzles (3D ResNet-18) | false | Kinetics400 | 33.7 | Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles | - |
| XDC | false | IG-Kinetics | 68.9 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| 3D RotNet (3D ResNet-18) | false | Kinetics400 | 33.7 | Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction | - |
| CVRL (R3D-152 2x; K600) | false | Kinetics600 | 69.9 | Spatiotemporal Contrastive Video Representation Learning | |
| BraVe:V-FA (TSM-50x2) | false | - | 70.5 | Broaden Your Views for Self-Supervised Video Learning | |
| XKD-Modality-Agnostic (ViT-B/112/16) | - | - | 65.9 | XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning | |
| Shuffle and Learn (AlexNet) | false | UCF101 | 19.8 | Shuffle and Learn: Unsupervised Learning using Temporal Order Verification | - |
| DPC (Modified 3D ResNet-18) | false | Kinetics400 | 34.5 | Video Representation Learning by Dense Predictive Coding | |