
Action Machine: Rethinking Action Recognition in Trimmed Videos

Zhu, Jiagang ; Zou, Wei ; Xu, Liang ; Hu, Yiming ; Zhu, Zheng ; Chang, Manyu ; Huang, Junjie ; Huang, Guan ; Du, Dalong
Abstract

Existing methods for video action recognition mostly do not distinguish the human body from the environment and easily overfit to scenes and objects. In this work, we present a conceptually simple, general and high-performance framework for action recognition in trimmed videos, aiming at person-centric modeling. The method, called Action Machine, takes as input videos cropped by person bounding boxes. It extends the Inflated 3D ConvNet (I3D) by adding a branch for human pose estimation and a 2D CNN for pose-based action recognition, and is fast to train and test. Action Machine can benefit from the multi-task training of action recognition and pose estimation and from the fusion of predictions from RGB images and poses. On NTU RGB-D, Action Machine achieves state-of-the-art performance, with top-1 accuracies of 97.2% and 94.3% on cross-view and cross-subject, respectively. Action Machine also achieves competitive performance on another three smaller action recognition datasets: Northwestern UCLA Multiview Action3D, MSR Daily Activity3D and UTD-MHAD. Code will be made available.
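
The abstract mentions fusing predictions from RGB images and poses but does not state the fusion rule. The sketch below illustrates one common choice, a weighted average of per-stream softmax probabilities (late fusion). The function names, the `rgb_weight` parameter and the example scores are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(rgb_logits, pose_logits, rgb_weight=0.5):
    """Late fusion of two streams by weighted averaging of probabilities.

    rgb_weight is a hypothetical hyperparameter, not a value from the paper.
    Returns the fused probabilities and the index of the top-1 class.
    """
    if len(rgb_logits) != len(pose_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")

    rgb_probs = softmax(rgb_logits)
    pose_probs = softmax(pose_logits)
    fused = [
        rgb_weight * r + (1.0 - rgb_weight) * p
        for r, p in zip(rgb_probs, pose_probs)
    ]
    top1 = max(range(len(fused)), key=fused.__getitem__)
    return fused, top1


# Made-up scores for three action classes, one list per stream.
fused, label = fuse_predictions([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
print(fused, label)
```

Averaging probabilities rather than raw scores keeps the two streams on a comparable scale; setting `rgb_weight` to 1.0 or 0.0 reduces the sketch to a single-stream prediction.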
