Action Recognition In Videos
Action Recognition On Diving 48
Evaluation Metric
Accuracy

Evaluation Results
Performance of each model on this benchmark.
| Model | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| ORViT TimeSformer | 88.0 | Object-Region Video Transformers | |
| VIMPAC | 85.5 | VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning | |
| SlowFast | 77.6 | SlowFast Networks for Video Recognition | |
| LVMAE | 94.9 | Extending Video Masked Autoencoders to 128 frames | - |
| StructVit-B-4-1 | 88.3 | Learning Correlation Structures for Vision Transformers | - |
| TimeSformer | 75.0 | Is Space-Time Attention All You Need for Video Understanding? | |
| DUALPATH | 88.7 | Dual-path Adaptation from Image to Video Transformers | |
| TimeSformer-HR | 78.0 | Is Space-Time Attention All You Need for Video Understanding? | |
| TFCNet | 88.3 | TFCNet: Temporal Fully Connected Networks for Static Unbiased Temporal Reasoning | - |
| Video-FocalNet-B | 90.8 | Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition | |
| AIM (CLIP ViT-L/14, 32x224) | 90.6 | AIM: Adapting Image Models for Efficient Video Action Recognition | |
| RSANet-R50 (16 frames, ImageNet pretrained, a single clip) | 84.2 | Relational Self-Attention: What's Missing in Attention for Video Understanding | |
| GC-TDN | 87.6 | Group Contextualization for Video Recognition | |
| BEVT | 86.7 | BEVT: BERT Pretraining of Video Transformers | |
| PMI Sampler | 81.3 | PMI Sampler: Patch Similarity Guided Frame Selection for Aerial Action Recognition | |
| TQN | 81.8 | Temporal Query Networks for Fine-grained Video Understanding | - |
| TimeSformer-L | 81.0 | Is Space-Time Attention All You Need for Video Understanding? | |
| PSB | 86.0 | Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition | |
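The Accuracy column reports top-1 classification accuracy: the fraction of Diving48 test clips whose highest-scoring predicted class matches the ground-truth dive label. A minimal sketch of that computation, assuming per-clip logits are available as a NumPy array (the function and variable names below are illustrative, not taken from any of the listed repositories):

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of clips whose argmax prediction matches the label.

    logits: (num_clips, num_classes) class scores for each test clip
    labels: (num_clips,) integer ground-truth class indices
    """
    preds = logits.argmax(axis=1)           # predicted class per clip
    return float((preds == labels).mean())  # share of correct predictions

# Illustrative usage with random scores over Diving48's 48 dive classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 48))
labels = rng.integers(0, 48, size=1000)
print(f"top-1 accuracy: {100 * top1_accuracy(logits, labels):.1f}%")
```

Annotations such as "(16 frames, ImageNet pretrained, a single clip)" describe the inference protocol; when a model is evaluated with multiple temporal clips per video, the per-clip scores are typically averaged before the argmax is taken.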