ARCH: Animatable Reconstruction of Clothed Humans

In this paper, we propose ARCH (Animatable Reconstruction of Clothed Humans), a novel end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image. Existing approaches to digitizing 3D humans struggle to handle pose variations and recover details. Also, they do not produce models that are animation-ready. In contrast, ARCH is a learned pose-aware model that produces detailed 3D rigged full-body human avatars from a single unconstrained RGB image. A Semantic Space and a Semantic Deformation Field are created using a parametric 3D body estimator. They allow the transformation of 2D/3D clothed humans into a canonical space, reducing ambiguities in geometry caused by pose variations and occlusions in training data. Detailed surface geometry and appearance are learned using an implicit function representation with spatial local features. Furthermore, we propose additional per-pixel supervision on the 3D reconstruction using opacity-aware differentiable rendering. Our experiments indicate that ARCH increases the fidelity of the reconstructed humans. We obtain more than 50% lower reconstruction errors for standard metrics compared to state-of-the-art methods on public datasets. We also show numerous qualitative examples of animated, high-quality reconstructed avatars unseen in the literature so far.
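To illustrate the implicit function representation mentioned above, here is a minimal sketch of an occupancy function that maps a 3D point in the canonical space, together with a spatial local feature vector, to an occupancy value in [0, 1]. This is not the authors' implementation: the network architecture, layer widths, and feature dimension are all illustrative assumptions, and the randomly initialized weights stand in for a trained model.

```python
import numpy as np

# Assumed dimensions (not specified in the abstract).
FEAT_DIM = 8   # hypothetical spatial local-feature dimension
HIDDEN = 16    # hypothetical hidden-layer width

rng = np.random.default_rng(0)

# Random weights as a stand-in for a trained implicit network.
W1 = rng.standard_normal((3 + FEAT_DIM, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal((HIDDEN, 1)) * 0.1
b2 = np.zeros(1)

def occupancy(point, feature):
    """Occupancy in [0, 1] of a canonical-space 3D point,
    conditioned on its spatial local feature vector."""
    x = np.concatenate([point, feature])
    h = np.maximum(x @ W1 + b1, 0.0)       # ReLU hidden layer
    logit = (h @ W2 + b2)[0]
    return 1.0 / (1.0 + np.exp(-logit))    # sigmoid

# Query one point; a mesh would be extracted at the 0.5 level set
# (e.g. via marching cubes over a dense grid of such queries).
p = np.array([0.0, 0.1, -0.2])
f = rng.standard_normal(FEAT_DIM)
o = occupancy(p, f)
print(o)
```

In practice such a function is evaluated densely over a 3D grid in the canonical space and the surface is extracted as a level set, which is what makes the representation resolution-independent compared to fixed voxel grids.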