NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding

Liu, Jun; Shahroudy, Amir; Perez, Mauricio; Wang, Gang; Duan, Ling-Yu; Kot, Alex C.
Abstract

Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, realistic number of distinct class categories, diversity in camera views, varied environmental conditions, and variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames. This dataset contains 120 different action classes including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset, and show the advantage of applying deep learning methods for 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework is proposed for this task, which yields promising results for recognition of the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding. [The dataset is available at: http://rose1.ntu.edu.sg/Datasets/actionRecognition.asp]