Action Recognition
Temporal Action Localization is a computer vision task that detects activities in video streams and outputs the start and end timestamps of each one. By pinpointing when actions occur in a video, it supports applications such as video analysis, surveillance, and content retrieval. The task is closely related to Temporal Action Proposal Generation, and accurate localization improves both the accuracy and the efficiency of video understanding.
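To make the output format concrete, the sketch below represents a detection as a labeled segment with start and end timestamps and compares a prediction against a reference segment using temporal intersection-over-union (tIoU). The `Segment` class, its field names, and the `temporal_iou` helper are hypothetical names chosen for illustration, and the use of tIoU is an assumption about how such outputs are commonly scored, not something stated on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One localized action: a label plus start/end timestamps in seconds (hypothetical schema)."""
    label: str
    start: float
    end: float


def temporal_iou(a: Segment, b: Segment) -> float:
    """Return the temporal intersection-over-union of two segments (0.0 if they do not overlap)."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


# Illustrative values only, not taken from any benchmark.
prediction = Segment("long jump", start=0.0, end=10.0)
reference = Segment("long jump", start=5.0, end=15.0)
overlap = temporal_iou(prediction, reference)
```

In this sketch a prediction would count as correct when its label matches and `overlap` exceeds a chosen threshold; the threshold itself is left open here because it varies by benchmark.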
ActivityNet-1.2: DeepMetricLearner
ActivityNet-1.3: AVFusion
CrossTask: VideoCLIP
Ego4D MQ test: ActionFormer (SlowFast+Omnivore+EgoVLP)
Ego4D MQ val
EPIC-KITCHENS-100: AdaTAD (verb, VideoMAE-L)
FineAction: VideoMAE V2-g
HACS: RDFA-S6 (InternVideo2-6B)
MEXaction2: S-CNN
MultiTHUMOS: TriDet (VideoMAEv2)
MUSES: TemporalMaxer
THUMOS'14: AVFusion
THUMOS’14: ActionFormer (VideoMAE V2-g features)
THUMOS14: BasicTAD (R50-SlowOnly)