MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation

Heatmap representations have formed the basis of human pose estimation systems for many years, and their extension to 3D has been a fruitful line of recent research. This includes 2.5D volumetric heatmaps, whose X and Y axes correspond to image space and whose Z axis corresponds to metric depth around the subject. To obtain metric-scale predictions, 2.5D methods need a separate post-processing step to resolve scale ambiguity. Further, they cannot localize body joints outside the image boundaries, leading to incomplete estimates for truncated images. To address these limitations, we propose metric-scale truncation-robust (MeTRo) volumetric heatmaps, whose dimensions are all defined in metric 3D space instead of being aligned with image space. This reinterpretation of heatmap dimensions allows us to directly estimate complete, metric-scale poses without test-time knowledge of distance and without relying on anthropometric heuristics such as bone lengths. To further demonstrate the utility of our representation, we present a differentiable combination of our 3D metric-scale heatmaps with 2D image-space ones to estimate absolute 3D pose (our MeTRAbs architecture). We find that supervision via absolute pose loss is crucial for accurate non-root-relative localization. Using a ResNet-50 backbone without further learned layers, we obtain state-of-the-art results on Human3.6M, MPI-INF-3DHP and MuPoTS-3D. Our code will be made publicly available to facilitate further research.
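
As an illustration only, the following sketch shows one way a root-relative, metric-scale 3D pose could be combined with 2D image-space joint locations to obtain an absolute 3D pose. The abstract does not specify how the combination is computed, so everything here is an assumption: the sketch assumes a pinhole camera, 2D joints already converted to normalized image coordinates (intrinsics removed), and a linear least-squares solve for the unknown translation. The function names `recover_translation` and `absolute_pose` are hypothetical.

```python
import numpy as np

def recover_translation(rel_pose3d, pose2d_norm):
    """Estimate the translation t that places a root-relative pose in camera space.

    rel_pose3d:  (J, 3) metric-scale joint positions relative to the root joint.
    pose2d_norm: (J, 2) joint positions in normalized image coordinates
                 (assumption: pixel coordinates with intrinsics already removed).

    Pinhole projection gives u = (X + tx) / (Z + tz) and v = (Y + ty) / (Z + tz).
    Multiplying out yields equations that are linear in t:
        tx - u * tz = u * Z - X
        ty - v * tz = v * Z - Y
    """
    num_joints = rel_pose3d.shape[0]
    X, Y, Z = rel_pose3d.T
    u, v = pose2d_norm.T

    A = np.zeros((2 * num_joints, 3))
    b = np.zeros(2 * num_joints)
    # Rows for the horizontal image coordinate.
    A[:num_joints, 0] = 1.0
    A[:num_joints, 2] = -u
    b[:num_joints] = u * Z - X
    # Rows for the vertical image coordinate.
    A[num_joints:, 1] = 1.0
    A[num_joints:, 2] = -v
    b[num_joints:] = v * Z - Y

    # Least-squares solution of the overdetermined system A t = b.
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

def absolute_pose(rel_pose3d, pose2d_norm):
    """Shift the root-relative pose by the estimated translation."""
    return rel_pose3d + recover_translation(rel_pose3d, pose2d_norm)
```

Because the solve is a linear least-squares problem, it can in principle be expressed with differentiable operations in a deep learning framework, which is what would allow a loss on the absolute pose to be backpropagated through both the 3D and the 2D predictions.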