Estimating Egocentric 3D Human Pose in Global Space

Egocentric 3D human pose estimation using a single fisheye camera has become popular recently as it allows capturing a wide range of daily activities in unconstrained environments, which is difficult for traditional outside-in motion capture with external cameras. However, existing methods have several limitations. A prominent problem is that the estimated poses lie in the local coordinate system of the fisheye camera, rather than in the world coordinate system, which is restrictive for many applications. Furthermore, these methods suffer from limited accuracy and temporal instability due to ambiguities caused by the monocular setup and the severe occlusion in a strongly distorted egocentric perspective. To tackle these limitations, we present a new method for egocentric global 3D body pose estimation using a single head-mounted fisheye camera. To achieve accurate and temporally stable global poses, a spatio-temporal optimization is performed over a sequence of frames by minimizing heatmap reprojection errors and enforcing local and global body motion priors learned from a mocap dataset. Experimental results show that our approach outperforms state-of-the-art methods both quantitatively and qualitatively.
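To make the abstract's core idea concrete, the following is a minimal illustrative sketch (not the paper's implementation) of a spatio-temporal optimization that balances a heatmap reprojection term against a temporal term over a window of frames. All names (`project_fisheye`, `objective`, the window size, the placeholder data) are hypothetical, the fisheye projection is replaced by a toy pinhole-style projection, and a simple smoothness penalty stands in for the learned local and global body motion priors.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative sketch: jointly optimize 3D joint positions over a window of T frames
# so that (1) their projections land near the 2D joint heatmap peaks and
# (2) the motion stays plausible (here: a smoothness term as a stand-in for a learned prior).

T, J = 8, 15                                # frames in the window, number of body joints (illustrative)
heatmap_peaks = np.random.rand(T, J, 2)     # detected 2D joint locations (placeholder data)

def project_fisheye(joints_3d):
    """Placeholder projection; a real system would use a calibrated fisheye camera model."""
    return joints_3d[..., :2] / (joints_3d[..., 2:3] + 1e-6)

def objective(x, lam=0.1):
    joints = x.reshape(T, J, 3)
    # Reprojection term: squared 2D distance between projected joints and heatmap peaks.
    reproj = np.sum((project_fisheye(joints) - heatmap_peaks) ** 2)
    # Temporal term: penalize frame-to-frame jitter (stand-in for learned motion priors).
    smooth = np.sum((joints[1:] - joints[:-1]) ** 2)
    return reproj + lam * smooth

x0 = np.ones(T * J * 3)                     # rough initialization, e.g. from per-frame estimates
result = minimize(objective, x0, method="L-BFGS-B")
joints_opt = result.x.reshape(T, J, 3)      # refined 3D joints for the whole window
```

The point of the sketch is the structure of the objective: per-frame image evidence (heatmap reprojection) and cross-frame regularization are optimized jointly over a sequence, which is what yields temporally stable poses rather than independent per-frame estimates.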