Self-Supervised Action Recognition on HMDB51
Evaluation Metrics
Frozen
Pre-Training Dataset
Top-1 Accuracy
Evaluation Results
Performance of each model on this benchmark
| Model Name | Frozen | Pre-Training Dataset | Top-1 Accuracy | Paper Title |
| --- | --- | --- | --- | --- |
| MVD (ViT-B) | false | Kinetics400 | 79.7 | Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning |
| M3Video | false | Kinetics400 | 78.0 | Masked Motion Encoding for Self-Supervised Video Representation Learning |
| pBYOL | false | Kinetics400 | 75.0 | A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning |
| SCE (R3D-50) | false | Kinetics400 | 74.7 | Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning |
| VideoMAE | false | Kinetics400 | 73.3 | VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training |
| BraVe:V-FA (TSM-50x2) | false | - | 70.5 | Broaden Your Views for Self-Supervised Video Learning |
| CVRL (R3D-152 2x; K600) | false | Kinetics600 | 69.9 | Spatiotemporal Contrastive Video Representation Learning |
| XKD (ViT-B/112/16) | - | - | 69 | XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning |
| XDC | false | IG-Kinetics | 68.9 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering |
| CVRL (R3D-50; K600) | false | Kinetics600 | 68.0 | Spatiotemporal Contrastive Video Representation Learning |
| CrissCross (AudioSet) | false | AudioSet | 66.8 | Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity |
| CVRL (R3D-50; K400) | false | Kinetics400 | 66.7 | Spatiotemporal Contrastive Video Representation Learning |
| XDC | false | IG-Random | 66.5 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering |
| XKD-Modality-Agnostic (ViT-B/112/16) | - | - | 65.9 | XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning |
| VideoMS (ViT-B) | false | no extra data | 65.8 | EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens |
| RSPNet | false | Kinetics400 | 64.7 | RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning |
| CrissCross (Kinetics400) | false | Kinetics400 | 64.7 | Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity |
| AVID+CMA (Modified R2+1D-18 on Audioset) | false | Audioset (Video+Audio) | 64.7 | Audio-Visual Instance Discrimination with Cross-Modal Agreement |
| ELo | false | - | 64.5 | Evolving Losses for Unsupervised Video Representation Learning |
| AVID (Modified R2+1D-18 on Audioset) | false | Audioset (Video+Audio) | 64.1 | Audio-Visual Instance Discrimination with Cross-Modal Agreement |