
UniHPE: Towards Unified Human Pose Estimation via Contrastive Learning

Jiang, Zhongyu ; Chai, Wenhao ; Li, Lei ; Zhou, Zhuoran ; Yang, Cheng-Yen ; Hwang, Jenq-Neng
Abstract

In recent times, there has been a growing interest in developing effective perception techniques for combining information from multiple modalities. This involves aligning features obtained from diverse sources to enable more efficient training with larger datasets and constraints, as well as leveraging the wealth of information contained in each modality. 2D and 3D Human Pose Estimation (HPE) are two critical perceptual tasks in computer vision, which have numerous downstream applications, such as Action Recognition, Human-Computer Interaction, Object tracking, etc. Yet, there are limited instances where the correlation between Image and 2D/3D human pose has been clearly researched using a contrastive paradigm. In this paper, we propose UniHPE, a unified Human Pose Estimation pipeline, which aligns features from all three modalities, i.e., 2D human pose estimation, lifting-based and image-based 3D human pose estimation, in the same pipeline. To align more than two modalities at the same time, we propose a novel singular value based contrastive learning loss, which better aligns different modalities and further boosts the performance. In our evaluation, UniHPE achieves remarkable performance metrics: MPJPE $50.5$mm on the Human3.6M dataset and PAMPJPE $51.6$mm on the 3DPW dataset. Our proposed method holds immense potential to advance the field of computer vision and contribute to various applications.
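
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how MPJPE and PA-MPJPE are commonly computed for a single pose. It is not the authors' evaluation code. The function names (`mpjpe`, `procrustes_align`, `pa_mpjpe`), the `(J, 3)` array layout, and the use of millimetres are assumptions made for illustration; the alignment step uses the standard SVD-based similarity (Procrustes) fit.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error for one pose.

    pred, gt: arrays of shape (J, 3), assumed to be in millimetres.
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Similarity-align pred to gt (scale, rotation, translation)."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g

    # Optimal rotation from the SVD of the 3x3 cross-covariance matrix.
    U, S, Vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Least-squares optimal isotropic scale.
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * p @ R.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of pred to gt."""
    return mpjpe(procrustes_align(pred, gt), gt)
```

In this sketch, a prediction that differs from the ground truth only by a global rotation, scale, and translation has a PA-MPJPE of (numerically) zero while its plain MPJPE is generally non-zero, which is why the two numbers are reported separately.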