Ray3D: ray-based 3D human pose estimation for monocular absolute 3D localization

In this paper, we propose Ray3D, a novel monocular ray-based method for absolute 3D human pose estimation with a calibrated camera. Accurate and generalizable absolute 3D human pose estimation from monocular 2D pose input is an ill-posed problem. To address this challenge, we convert the input from pixel space to 3D normalized rays. This conversion makes our approach robust to changes in the camera intrinsic parameters. To deal with in-the-wild variations in the camera extrinsic parameters, Ray3D explicitly takes the camera extrinsic parameters as an input and jointly models the distribution between the 3D pose rays and the camera extrinsic parameters. This novel network design is the key to the outstanding generalizability of the Ray3D approach. To gain a comprehensive understanding of how variations in the camera intrinsic and extrinsic parameters affect the accuracy of absolute 3D key-point localization, we conduct in-depth systematic experiments on three single-person 3D benchmarks as well as one synthetic benchmark. These experiments demonstrate that our method significantly outperforms existing state-of-the-art models. Our code and the synthetic dataset are available at https://github.com/YxZhxn/Ray3D .
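
The conversion from pixel space to 3D normalized rays can be illustrated with a minimal sketch. It assumes an ideal pinhole camera without lens distortion and normalizes each ray to unit length; the function name `pixels_to_rays` and the example intrinsic matrices are hypothetical and are not taken from the Ray3D code base, whose exact normalization may differ.

```python
import numpy as np


def pixels_to_rays(keypoints_px, K):
    """Back-project 2D pixel keypoints to unit-length rays in the camera frame.

    keypoints_px: (N, 2) array-like of pixel coordinates (u, v).
    K: (3, 3) pinhole intrinsic matrix (assumed: no lens distortion).
    """
    keypoints_px = np.asarray(keypoints_px, dtype=np.float64)
    ones = np.ones((keypoints_px.shape[0], 1))
    homogeneous = np.hstack([keypoints_px, ones])       # (N, 3) rows [u, v, 1]
    directions = homogeneous @ np.linalg.inv(K).T       # K^{-1} applied to each row
    return directions / np.linalg.norm(directions, axis=1, keepdims=True)


def project(points_cam, K):
    """Project 3D camera-frame points to pixels with the same pinhole model."""
    uvw = np.asarray(points_cam, dtype=np.float64) @ K.T
    return uvw[:, :2] / uvw[:, 2:3]


# Two hypothetical cameras with different focal lengths and principal points.
K_a = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
K_b = np.array([[1500.0, 0.0, 960.0], [0.0, 1500.0, 540.0], [0.0, 0.0, 1.0]])

# The same 3D joints (in metres, camera frame) seen by both cameras.
joints_cam = np.array([[0.2, -0.5, 3.0], [-0.4, 0.1, 4.0]])

rays_a = pixels_to_rays(project(joints_cam, K_a), K_a)
rays_b = pixels_to_rays(project(joints_cam, K_b), K_b)
```

The pixel coordinates differ between the two cameras, while `rays_a` and `rays_b` agree up to floating-point error. This is the sense in which a ray representation removes the dependence on the intrinsic parameters.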