HyperAI

Temporal Action Localization on ActivityNet

Results

Performance results of various models on this benchmark

| Model Name | mAP | mAP IOU@0.5 | mAP IOU@0.75 | mAP IOU@0.95 | Paper Title | Repository |
|---|---|---|---|---|---|---|
| AdaTAD (VideoMAEv2-giant) | 41.93 | 61.72 | 43.35 | 10.85 | End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames | - |
| BSN++ | 34.88 | 51.27 | 35.70 | 8.33 | BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation | - |
| PRN (CSN) | 39.4 | 57.9 | - | - | Proposal Relation Network for Temporal Action Detection | - |
| TSP | 35.81 | 51.26 | 37.12 | 9.29 | TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks | - |
| GCM | 34.24 | 51.03 | 35.17 | 7.44 | Graph Convolutional Module for Temporal Action Localization in Videos | - |
| BSN | 30.03 | 46.45 | 29.96 | 8.02 | BSN: Boundary Sensitive Network for Temporal Action Proposal Generation | - |
| TAGS (I3D) | 36.5 | - | - | - | Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning | - |
| TadTR (TSP features) | 36.75 | 53.62 | 37.52 | 10.56 | End-to-end Temporal Action Detection with Transformer | - |
| SSN | 32.26 | 39.12 | - | - | A Pursuit of Temporal Accuracy in General Activity Detection | - |
| BC-GNN | 34.26 | 50.56 | 34.75 | 9.37 | Boundary Content Graph Neural Network for Temporal Action Proposal Generation | - |
| E2E-TAD (SlowFast R50+TadTR) | 35.10 | 50.47 | 35.99 | 10.83 | An Empirical Study of End-to-End Temporal Action Detection | - |
| InternVideo2-6B | 41.2 | - | - | - | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | - |
| HCN (I3D features) | 35.61 | 52.51 | 36.10 | 7.12 | Improve Temporal Action Proposals using Hierarchical Context | - |
| LoFi+G-TAD | 34.96 | 50.91 | 35.86 | 8.79 | Low-Fidelity Video Encoder Optimization for Temporal Action Localization | - |
| UniMD+Sync. | 39.83 | 60.29 | - | - | UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection | - |
| RDFA-S6 (InternVideo2-6B) | 42.9 | 64.1 | 44.0 | 10.6 | Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism | - |
| ActionMamba (InternVideo2-6B) | 42.02 | 62.43 | 43.49 | 10.23 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding | - |
| AVFusion | 36.82 | 54.34 | 37.66 | 8.93 | Hear Me Out: Fusional Approaches for Audio Augmented Temporal Action Localization | - |
| InternVideo | 39.00 | - | - | - | InternVideo: General Video Foundation Models via Generative and Discriminative Learning | - |
| VSGN (TSP features) | 35.94 | 53.26 | 36.76 | 8.12 | Video Self-Stitching Graph Network for Temporal Action Localization | - |