HyperAI초신경


Action Recognition in Videos on ActivityNet

Evaluation Metric

mAP
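
For readers unfamiliar with the metric, the following is a minimal sketch of one common way to compute mean average precision (mAP): the non-interpolated average precision per class, averaged over classes. The function names and the toy inputs are illustrative assumptions, not taken from this page, and the benchmark's official evaluation script may differ in details such as interpolation or tie handling. The scores in the table appear to be on a 0 to 100 scale, so the value returned here would be multiplied by 100 to compare.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP for one class.

    Averages precision@k over the ranks k at which a positive video appears.
    (Hypothetical helper for illustration only.)
    """
    # Rank videos from highest to lowest score for this class.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # A class with no positives contributes 0.0 in this sketch (an assumption).
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    score_matrix: Sequence[Sequence[float]],
    label_matrix: Sequence[Sequence[int]],
) -> float:
    """score_matrix[c] and label_matrix[c] hold per-video scores and 0/1 labels for class c."""
    aps = [average_precision(s, l) for s, l in zip(score_matrix, label_matrix)]
    return sum(aps) / len(aps)


# Toy example: two classes, three videos each.
toy_scores = [[0.9, 0.8, 0.3], [0.2, 0.7, 0.6]]
toy_labels = [[1, 0, 1], [0, 1, 0]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

In the toy example, the first class has positives at ranks 1 and 3, giving an AP of (1/1 + 2/3) / 2, and the second class has its only positive at rank 1, giving an AP of 1.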

Evaluation Results

Performance of each model on this benchmark

Model Name	mAP	Paper Title	Repository
Ada3D	84.0	2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition	-
VGG19 + 393K webcam images	53.8	Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web	-
Text4Vis (w/ ViT-L)	96.9	Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
InternVideo2-6B	95.9	InternVideo2: Scaling Foundation Models for Multimodal Video Understanding
VGG19	52.3	Do Less and Achieve More: Training CNNs for Action Recognition Utilizing Action Images from the Web	-
SMART	84.4	SMART Frame Selection for Action Recognition	-
CD-UAR	53.8	Towards Universal Representation for Unseen Action Recognition	-
P3D	78.9	Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks
RRA	83.4	Fine-grained Video Categorization with Redundancy Reduction Attention	-
MARL (w/ SEResNeXt-152)	90.05	Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition	-
DSN	87.9	Dynamic Sampling Networks for Efficient Action Recognition in Videos	-
NSNet (w/ Swin-L)	94.3	NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition	-
ListenToLook	89.9	Listen to Look: Action Recognition by Previewing Audio
DSANet (w/ 3D ResNet50)	90.5	DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
BIKE	96.1	Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
TSQNet (w/ Swin-L)	93.7	Temporal Saliency Query Network for Efficient Video Recognition	-
