
One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

Lin, Jing ; Zeng, Ailing ; Wang, Haoqian ; Zhang, Lei ; Li, Yu
Abstract

Whole-body mesh recovery aims to estimate the 3D human body, face, and hands parameters from a single image. It is challenging to perform this task with a single network due to resolution issues, i.e., the face and hands are usually located in extremely small regions. Existing works usually detect hands and faces, enlarge their resolution to feed in a specific network to predict the parameter, and finally fuse the results. While this copy-paste pipeline can capture the fine-grained details of the face and hands, the connections between different parts cannot be easily recovered in late fusion, leading to implausible 3D rotation and unnatural pose. In this work, we propose a one-stage pipeline for expressive whole-body mesh recovery, named OSX, without separate networks for each part. Specifically, we design a Component Aware Transformer (CAT) composed of a global body encoder and a local face/hand decoder. The encoder predicts the body parameters and provides a high-quality feature map for the decoder, which performs a feature-level upsample-crop scheme to extract high-resolution part-specific features and adopts keypoint-guided deformable attention to estimate hand and face precisely. The whole pipeline is simple yet effective without any manual post-processing and naturally avoids implausible prediction. Comprehensive experiments demonstrate the effectiveness of OSX. Lastly, we build a large-scale Upper-Body dataset (UBody) with high-quality 2D and 3D whole-body annotations. It contains persons with partially visible bodies in diverse real-life scenarios to bridge the gap between the basic task and downstream applications.
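
To make the feature-level upsample-crop idea concrete, the following is a minimal sketch, not the OSX implementation. It assumes a single-channel feature map stored as nested lists, nearest-neighbour upsampling, and a hand-picked part box; the abstract does not specify the upsampling operator or how part regions are located, so the names `upsample_nearest`, `crop`, `upsample_crop`, and `hand_box` are all hypothetical.

```python
def upsample_nearest(feature_map, scale):
    """Nearest-neighbour upsampling of a 2D grid by an integer factor.

    Stand-in for whatever upsampling operator the real model uses.
    """
    return [
        [value for value in row for _ in range(scale)]
        for row in feature_map
        for _ in range(scale)
    ]


def crop(feature_map, box):
    """Crop a 2D grid to box = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    return [row[left:right] for row in feature_map[top:bottom]]


def upsample_crop(feature_map, box, scale):
    """Upsample the shared feature map, then crop the part region.

    The box is given in the coordinates of the original (low-resolution)
    map and is rescaled to match the upsampled map before cropping.
    """
    upsampled = upsample_nearest(feature_map, scale)
    scaled_box = tuple(coord * scale for coord in box)
    return crop(upsampled, scaled_box)


# Toy 4x4 feature map standing in for the encoder output.
features = [[r * 4 + c for c in range(4)] for r in range(4)]

# Hypothetical hand region: rows 1-2, columns 2-3 of the low-res map.
hand_box = (1, 2, 3, 4)

hand_features = upsample_crop(features, hand_box, scale=2)
for row in hand_features:
    print(row)
```

The point of the sketch is that the part-specific features come from the same feature map the body branch uses, at a higher resolution, rather than from a separate network run on an enlarged image crop.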
