Action Recognition In Videos On Activitynet
Evaluation metric
mAP
Evaluation results
Performance results of each model on this benchmark
| Model name | mAP | Paper Title | Repository |
| --- | --- | --- | --- |
| Ada3D | 84.0 | 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition | - |
| VGG19 + 393K webcam images | 53.8 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| Text4Vis (w/ ViT-L) | 96.9 | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | |
| InternVideo2-6B | 95.9 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| VGG19 | 52.3 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| SMART | 84.4 | SMART Frame Selection for Action Recognition | - |
| CD-UAR | 53.8 | Towards Universal Representation for Unseen Action Recognition | - |
| P3D | 78.9 | Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks | |
| RRA | 83.4 | Fine-grained Video Categorization with Redundancy Reduction Attention | - |
| MARL (w/ SEResNeXt-152) | 90.05 | Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition | - |
| DSN | 87.9 | Dynamic Sampling Networks for Efficient Action Recognition in Videos | - |
| NSNet (w/ Swin-L) | 94.3 | NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | - |
| ListenToLook | 89.9 | Listen to Look: Action Recognition by Previewing Audio | |
| DSANet (w/ 3D ResNet50) | 90.5 | DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | |
| BIKE | 96.1 | Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | |
| TSQNet (w/ Swin-L) | 93.7 | Temporal Saliency Query Network for Efficient Video Recognition | - |