Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos

Shahroudy, Amir ; Ng, Tian-Tsong ; Gong, Yihong ; Wang, Gang
Abstract

Single-modality action recognition on RGB or depth sequences has been extensively explored recently. It is generally accepted that each of these two modalities has different strengths and limitations for the task of action recognition. Therefore, analysis of RGB+D videos can help us to better study the complementary properties of these two types of modalities and achieve higher levels of performance. In this paper, we propose a new deep autoencoder-based shared-specific feature factorization network to separate input multimodal signals into a hierarchy of components. Further, based on the structure of the features, a structured sparsity learning machine is proposed which utilizes mixed norms to apply regularization within components and group selection between them for better classification performance. Our experimental results show the effectiveness of our cross-modality feature analysis framework by achieving state-of-the-art accuracy for action classification on five challenging benchmark datasets.
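To make the idea of "mixed norms" for group selection more concrete, the following is a minimal sketch of one common mixed norm, the L2,1 (group lasso) penalty, together with its shrinkage step. The abstract does not state which mixed norm the paper uses, so the choice of L2,1, the function names `group_l21_norm` and `group_soft_threshold`, and the toy three-group layout are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def group_l21_norm(w, groups):
    """Sum of the Euclidean norms of each weight group (an L2,1 mixed norm)."""
    return sum(np.linalg.norm(w[idx]) for idx in groups)

def group_soft_threshold(w, groups, lam):
    """Shrinkage step for lam * L2,1: scale each group down by lam in norm,
    and set a group entirely to zero when its norm is at most lam."""
    out = np.zeros_like(w, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * w[idx]
    return out

# Hypothetical layout (assumption): three feature components, e.g. one shared
# and two modality-specific, each holding two classifier weights.
w = np.array([3.0, 4.0, 0.1, -0.1, 0.6, 0.8])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]

penalty = group_l21_norm(w, groups)
shrunk = group_soft_threshold(w, groups, lam=0.5)
```

In this toy case the weakly weighted middle group is removed as a whole while the other two groups are only scaled down, which is the kind of between-group selection the abstract refers to. The within-component regularization mentioned in the abstract would require an additional norm inside each group, which this sketch does not attempt.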