HyperAI超神经

Action Recognition in Videos on ActivityNet

Evaluation Metric

mAP

Results

Performance of each model on this benchmark

| Model Name | mAP | Paper Title | Repository |
| --- | --- | --- | --- |
| Ada3D | 84.0 | 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition | - |
| VGG19 + 393K webcam images | 53.8 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| Text4Vis (w/ ViT-L) | 96.9 | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | |
| InternVideo2-6B | 95.9 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| VGG19 | 52.3 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| SMART | 84.4 | SMART Frame Selection for Action Recognition | - |
| CD-UAR | 53.8 | Towards Universal Representation for Unseen Action Recognition | - |
| P3D | 78.9 | Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks | |
| RRA | 83.4 | Fine-grained Video Categorization with Redundancy Reduction Attention | - |
| MARL (w/ SEResNeXt-152) | 90.05 | Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition | - |
| DSN | 87.9 | Dynamic Sampling Networks for Efficient Action Recognition in Videos | - |
| NSNet (w/ Swin-L) | 94.3 | NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | - |
| ListenToLook | 89.9 | Listen to Look: Action Recognition by Previewing Audio | |
| DSANet (w/ 3D ResNet50) | 90.5 | DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | |
| BIKE | 96.1 | Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | |
| TSQNet (w/ Swin-L) | 93.7 | Temporal Saliency Query Network for Efficient Video Recognition | - |