HyperAI
Action Recognition in Videos on ActivityNet
Metrics
mAP
Results
Performance results of various models on this benchmark
| Model Name | mAP | Paper Title | Repository |
| --- | --- | --- | --- |
| Ada3D | 84.0 | 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition | - |
| VGG19 + 393K webcam images | 53.8 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| Text4Vis (w/ ViT-L) | 96.9 | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | |
| InternVideo2-6B | 95.9 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| VGG19 | 52.3 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| SMART | 84.4 | SMART Frame Selection for Action Recognition | - |
| CD-UAR | 53.8 | Towards Universal Representation for Unseen Action Recognition | - |
| P3D | 78.9 | Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks | |
| RRA | 83.4 | Fine-grained Video Categorization with Redundancy Reduction Attention | - |
| MARL (w/ SEResNeXt-152) | 90.05 | Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition | - |
| DSN | 87.9 | Dynamic Sampling Networks for Efficient Action Recognition in Videos | - |
| NSNet (w/ Swin-L) | 94.3 | NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | - |
| ListenToLook | 89.9 | Listen to Look: Action Recognition by Previewing Audio | |
| DSANet (w/ 3D ResNet50) | 90.5 | DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | |
| BIKE | 96.1 | Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | |
| TSQNet (w/ Swin-L) | 93.7 | Temporal Saliency Query Network for Efficient Video Recognition | - |