Learning Trajectory Dependencies for Human Motion Prediction

Human motion prediction, i.e., forecasting future body poses given an observed pose sequence, has typically been tackled with recurrent neural networks (RNNs). However, as evidenced by prior work, the resulting RNN models suffer from the accumulation of prediction errors, leading to undesired discontinuities in the predicted motion. In this paper, we propose a simple feed-forward deep network for motion prediction, which takes into account both temporal smoothness and spatial dependencies among human body joints. In this context, we propose to encode temporal information by working in trajectory space instead of the traditionally used pose space. This frees us from having to manually define the range of temporal dependencies (or the temporal convolutional filter size, as done in previous work). Moreover, we encode the spatial dependencies of a human pose by treating it as a generic graph (rather than a human skeletal kinematic tree) formed by links between every pair of body joints. Instead of using a pre-defined graph structure, we design a new graph convolutional network that learns the graph connectivity automatically. This allows the network to capture long-range dependencies beyond those of the human kinematic tree. We evaluate our approach on several standard benchmark datasets for motion prediction, including Human3.6M, the CMU motion capture dataset, and 3DPW. Our experiments clearly demonstrate that the proposed approach achieves state-of-the-art performance and is applicable to both angle-based and position-based pose representations. The code is available at https://github.com/wei-mao-2019/LearnTrajDep
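To make the trajectory-space encoding concrete: rather than feeding per-frame poses to the network, each joint coordinate's trajectory over time is represented by discrete cosine transform (DCT) coefficients, and keeping only the low-frequency coefficients enforces temporal smoothness without hand-picking a temporal filter size. The NumPy sketch below illustrates that idea; the function names and the orthonormal DCT-II convention are illustrative choices, not the released repository's API.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n (rows = frequencies)."""
    k = np.arange(n)[:, None]          # frequency index
    t = np.arange(n)[None, :]          # time index
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] /= np.sqrt(2.0)           # DC row scaling for orthonormality
    return basis

def encode_trajectories(poses, num_coeffs):
    """Compress a pose sequence (T frames x J joint coordinates) into
    per-coordinate trajectory coefficients of shape (J, num_coeffs)."""
    dct = dct_matrix(poses.shape[0])
    coeffs = dct @ poses               # DCT along the time axis
    return coeffs[:num_coeffs].T       # keep only the low frequencies

def decode_trajectories(coeffs, num_frames):
    """Inverse map: (J, K) coefficients back to a (num_frames, J) sequence."""
    dct = dct_matrix(num_frames)
    return dct[: coeffs.shape[1]].T @ coeffs.T
```

In this representation, a convenient scheme (the one the paper adopts) is to pad the observed frames by repeating the last seen pose up to the full sequence length, have the network predict the DCT coefficients of the complete sequence, and recover the future poses with the inverse transform.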
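The learnable graph connectivity can likewise be sketched in a few lines of PyTorch. The key point is that the adjacency matrix is a free parameter trained jointly with the feature weights, so the model can place strong links between any pair of joints, not just those adjacent in the kinematic tree. The class below is a minimal sketch of that idea; its name, initialization scheme, and dimensions are assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn

class LearnableGraphConv(nn.Module):
    """Graph convolution with a fully learnable adjacency matrix:
    connectivity between every pair of joints is learned from data
    instead of being fixed by the skeletal kinematic tree."""

    def __init__(self, in_features, out_features, num_nodes):
        super().__init__()
        # Learnable connectivity over all node (joint-coordinate) pairs.
        self.adj = nn.Parameter(torch.empty(num_nodes, num_nodes))
        self.weight = nn.Parameter(torch.empty(in_features, out_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        nn.init.xavier_uniform_(self.adj)
        nn.init.xavier_uniform_(self.weight)

    def forward(self, x):
        # x: (batch, num_nodes, in_features)
        support = x @ self.weight               # per-node feature transform
        return self.adj @ support + self.bias  # mix features across nodes

# Example: 66 joint coordinates, 20 DCT coefficients each, 256-d features.
# layer = LearnableGraphConv(20, 256, num_nodes=66)
# out = layer(torch.randn(8, 66, 20))  # -> (8, 66, 256)
```

In the paper's full model, several such layers are stacked into residual blocks operating on the trajectory coefficients of each joint coordinate; the sketch above shows only the core mixing operation.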