Capturing the motion of every joint: 3D human pose and shape estimation with independent tokens

Sen Yang, Wen Heng, Gang Liu, Guozhong Luo, Wankou Yang, Gang Yu

Abstract

In this paper we present a novel method to estimate 3D human pose and shape from monocular videos. This task requires directly recovering pixel-aligned 3D human pose and body shape from monocular images or videos, which is challenging due to its inherent ambiguity. To improve precision, existing methods rely heavily on the initialized mean pose and shape as prior estimates and on parameter regression in an iterative error feedback manner. In addition, video-based approaches model the overall change over image-level features to temporally enhance the single-frame feature, but fail to capture rotational motion at the joint level and cannot guarantee local temporal consistency. To address these issues, we propose a novel Transformer-based model with a design of independent tokens. First, we introduce three types of tokens independent of the image feature: joint rotation tokens, a shape token, and a camera token. By progressively interacting with image features through Transformer layers, these tokens learn to encode prior knowledge of human 3D joint rotations, body shape, and position information from large-scale data, and are updated to estimate SMPL parameters conditioned on a given image. Second, benefiting from the proposed token-based representation, we further use a temporal model that focuses on capturing the rotational temporal information of each joint, which is empirically conducive to preventing large jitters in local parts. Despite being conceptually simple, the proposed method attains superior performance on the 3DPW and Human3.6M datasets. Using ResNet-50 and Transformer architectures, it obtains a PA-MPJPE of 42.0 mm on the challenging 3DPW dataset, outperforming state-of-the-art counterparts by a large margin. Code will be publicly available at https://github.com/yangsenius/INT_HMR_Model
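The independent-token idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a pure-Python toy under stated assumptions. The joint count (24, the usual SMPL joint count including the root), the token width, the unprojected single-head attention with a residual connection, and all function names (`init_tokens`, `attend`, `update_tokens`, `per_joint_sequences`) are hypothetical choices made for illustration. It shows only the data flow: tokens that exist independently of any image are concatenated with image features, updated by attention layers, and then regrouped per joint so that a temporal model could operate on each joint's sequence.

```python
import math
import random

NUM_JOINTS = 24  # assumption: usual SMPL joint count (root + 23 body joints)
DIM = 8          # assumption: tiny token width, chosen only for readability


def init_tokens(rng, num_joints=NUM_JOINTS, dim=DIM):
    """Create tokens that do not depend on any image (they would be learned)."""
    def vec():
        return [rng.gauss(0.0, 0.02) for _ in range(dim)]
    return {
        "joints": [vec() for _ in range(num_joints)],  # one token per joint rotation
        "shape": vec(),                                # one body shape token
        "camera": vec(),                               # one camera token
    }


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(sequence):
    """Single-head self-attention without learned projections, plus a residual.

    A stand-in for a Transformer layer; a real layer has learned weights,
    multiple heads, normalization, and a feed-forward block.
    """
    scale = 1.0 / math.sqrt(len(sequence[0]))
    updated = []
    for q in sequence:
        weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in sequence])
        mixed = [sum(w * v[d] for w, v in zip(weights, sequence)) for d in range(len(q))]
        updated.append([qi + mi for qi, mi in zip(q, mixed)])
    return updated


def update_tokens(tokens, image_features, num_layers=2):
    """Let the independent tokens interact with image features layer by layer."""
    num_joints = len(tokens["joints"])
    sequence = tokens["joints"] + [tokens["shape"], tokens["camera"]] + image_features
    for _ in range(num_layers):
        sequence = attend(sequence)
    return {
        "joints": sequence[:num_joints],
        "shape": sequence[num_joints],
        "camera": sequence[num_joints + 1],
    }


def per_joint_sequences(frames):
    """Regroup per-frame outputs into one temporal sequence per joint."""
    num_joints = len(frames[0]["joints"])
    return [[frame["joints"][j] for frame in frames] for j in range(num_joints)]
```

In a full model, each updated joint token would be mapped to a rotation and the shape and camera tokens to their respective SMPL and camera parameters by small regression heads, and each per-joint sequence would be passed to a temporal model; both steps are omitted here because the abstract does not specify them.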

