Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation

We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos, where the performance of existing SMPL-based models is significantly affected by distribution shift in the form of different camera parameters, bone lengths, backgrounds, and occlusions. We tackle this problem through online adaptation, gradually correcting the model bias during testing. There are two main challenges. First, the lack of 3D annotations increases the training difficulty and results in 3D ambiguities. Second, the non-stationary data distribution makes it difficult to strike a balance between fitting regular frames and fitting hard samples with severe occlusions or dramatic changes. To this end, we propose the Dynamic Bilevel Online Adaptation algorithm (DynaBOA). It first introduces temporal constraints to compensate for the unavailable 3D annotations, and leverages a bilevel optimization procedure to address the conflicts between multiple objectives. DynaBOA provides additional 3D guidance by co-training with similar source examples, which are retrieved efficiently despite the distribution shift. Furthermore, it can adaptively adjust the number of optimization steps on individual frames to fully fit hard samples and avoid overfitting regular frames. DynaBOA achieves state-of-the-art results on three out-of-domain human mesh reconstruction benchmarks.
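
The two mechanisms described above, a bilevel update that separates the frame-wise objective from the temporal objective and a per-frame step count that adapts to sample difficulty, can be illustrated with a toy sketch. This is not the DynaBOA implementation. It assumes a parameter vector in place of the mesh regressor, quadratic stand-ins for the frame-wise and temporal losses, a first-order approximation of the upper-level gradient, and a loss threshold as the stopping rule. All names (`bilevel_update`, `adapt_on_frame`, `make_quadratic`, `tol`, `max_steps`) are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]
LossFn = Callable[[Vector], float]
GradFn = Callable[[Vector], Vector]


def sgd_step(theta: Vector, grad: Vector, lr: float) -> Vector:
    """One plain gradient-descent step."""
    return [t - lr * g for t, g in zip(theta, grad)]


def make_quadratic(center: Vector, weight: float = 1.0) -> Tuple[LossFn, GradFn]:
    """Toy loss 0.5 * weight * ||theta - center||^2 and its gradient (assumed stand-in)."""
    def loss(theta: Vector) -> float:
        return 0.5 * weight * sum((t - c) ** 2 for t, c in zip(theta, center))

    def grad(theta: Vector) -> Vector:
        return [weight * (t - c) for t, c in zip(theta, center)]

    return loss, grad


def bilevel_update(theta: Vector, frame_grad: GradFn, temporal_grad: GradFn,
                   lr_inner: float, lr_outer: float) -> Vector:
    # Lower level: a probe step on the frame-wise objective only.
    probe = sgd_step(theta, frame_grad(theta), lr_inner)
    # Upper level: evaluate the combined (frame + temporal) gradient at the
    # probe point and apply it to the ORIGINAL parameters. This is a
    # first-order approximation chosen for the sketch.
    combined = [a + b for a, b in zip(frame_grad(probe), temporal_grad(probe))]
    return sgd_step(theta, combined, lr_outer)


def adapt_on_frame(theta: Vector, frame_loss: LossFn, frame_grad: GradFn,
                   temporal_grad: GradFn, tol: float = 0.05, max_steps: int = 10,
                   lr_inner: float = 0.1, lr_outer: float = 0.5) -> Tuple[Vector, int]:
    """Repeat bilevel updates until the frame loss drops below `tol` (assumed
    stopping rule) or `max_steps` is reached; returns parameters and step count."""
    steps = 0
    while steps < max_steps and frame_loss(theta) > tol:
        theta = bilevel_update(theta, frame_grad, temporal_grad, lr_inner, lr_outer)
        steps += 1
    return theta, steps


# Usage: the previous frame's parameters act as the temporal anchor.
prev = [0.0, 0.0]
_, temporal_grad = make_quadratic(prev, weight=0.1)

easy_loss, easy_grad = make_quadratic([0.3, 0.0])   # close to the current estimate
hard_loss, hard_grad = make_quadratic([1.0, -1.0])  # far from the current estimate

_, easy_steps = adapt_on_frame(prev, easy_loss, easy_grad, temporal_grad)
hard_theta, hard_steps = adapt_on_frame(prev, hard_loss, hard_grad, temporal_grad)
print(easy_steps, hard_steps)
```

In this toy setting the frame that is already well fitted receives fewer updates than the frame that is far from the current estimate, which mirrors the stated goal of fully fitting hard samples without overfitting regular frames. The actual difficulty measure and stopping criterion used by DynaBOA are not specified in this abstract.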