HyperAI

Action Recognition in Videos on ActivityNet

Metrics

mAP

Results

Performance results of various models on this benchmark

| Model Name | mAP | Paper Title | Repository |
| Ada3D | 84.0 | 2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition | - |
| VGG19 + 393K webcam images | 53.8 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| Text4Vis (w/ ViT-L) | 96.9 | Revisiting Classifier: Transferring Vision-Language Models for Video Recognition | |
| InternVideo2-6B | 95.9 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| VGG19 | 52.3 | Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web | - |
| SMART | 84.4 | SMART Frame Selection for Action Recognition | - |
| CD-UAR | 53.8 | Towards Universal Representation for Unseen Action Recognition | - |
| P3D | 78.9 | Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks | |
| RRA | 83.4 | Fine-grained Video Categorization with Redundancy Reduction Attention | - |
| MARL (w/ SEResNeXt-152) | 90.05 | Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition | - |
| DSN | 87.9 | Dynamic Sampling Networks for Efficient Action Recognition in Videos | - |
| NSNet (w/ Swin-L) | 94.3 | NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | - |
| ListenToLook | 89.9 | Listen to Look: Action Recognition by Previewing Audio | |
| DSANet (w/ 3D ResNet50) | 90.5 | DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | |
| BIKE | 96.1 | Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | |
| TSQNet (w/ Swin-L) | 93.7 | Temporal Saliency Query Network for Efficient Video Recognition | - |