Three Recipes for Better 3D Pseudo-GTs of 3D Human Mesh Estimation in the Wild

Recovering 3D human mesh in the wild is highly challenging because in-the-wild (ITW) datasets provide only 2D pose ground truths (GTs). Recently, 3D pseudo-GTs have been widely used to train 3D human mesh estimation networks, as they enable 3D mesh supervision when training the networks on ITW datasets. However, despite the great potential of 3D pseudo-GTs, there has been no extensive analysis of which factors are important for making them more beneficial. In this paper, we provide three recipes to obtain highly beneficial 3D pseudo-GTs of ITW datasets. The main challenge is that only 2D-based weak supervision is available when obtaining the 3D pseudo-GTs. Each of our three recipes addresses one aspect of this challenge: depth ambiguity, sub-optimality of weak supervision, and implausible articulation. Experimental results show that simply re-training state-of-the-art networks with our new 3D pseudo-GTs elevates their performance to the next level without bells and whistles. The 3D pseudo-GTs are publicly available at https://github.com/mks0601/NeuralAnnot_RELEASE.