
Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement

Wang, Jian ; Cao, Zhe ; Luvizon, Diogo ; Liu, Lingjie ; Sarkar, Kripasindhu ; Tang, Danhang ; Beeler, Thabo ; Theobalt, Christian
Abstract

In this work, we explore egocentric whole-body motion capture using a single fisheye camera, which simultaneously estimates human body and hand motion. This task presents significant challenges due to three factors: the lack of high-quality datasets, fisheye camera distortion, and human body self-occlusion. To address these challenges, we propose a novel approach that leverages FisheyeViT to extract fisheye image features, which are subsequently converted into pixel-aligned 3D heatmap representations for 3D human body pose prediction. For hand tracking, we incorporate dedicated hand detection and hand pose estimation networks for regressing 3D hand poses. Finally, we develop a diffusion-based whole-body motion prior model to refine the estimated whole-body motion while accounting for joint uncertainties. To train these networks, we collect a large synthetic dataset, EgoWholeBody, comprising 840,000 high-quality egocentric images captured across a diverse range of whole-body motion sequences. Quantitative and qualitative evaluations demonstrate the effectiveness of our method in producing high-quality whole-body motion estimates from a single egocentric camera.
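
To make the phrase "pixel-aligned 3D heatmap" more concrete, the sketch below shows one way a pixel location plus a distance along its viewing ray could be lifted to a 3D point under a fisheye lens. It is an illustration only: the abstract does not state which camera model the authors use, so the equidistant model (image radius r = f * theta) is an assumption here, and the function names and the parameters `f`, `cx`, `cy` are hypothetical.

```python
import math


def unproject_equidistant(u, v, distance, f, cx, cy):
    """Lift pixel (u, v) at a given distance along its ray to a 3D point.

    Assumes an equidistant fisheye model, r = f * theta, where theta is the
    angle between the viewing ray and the optical axis. This model is an
    assumption for illustration, not taken from the paper.
    """
    dx, dy = u - cx, v - cy
    r = math.hypot(dx, dy)      # radial distance from the principal point
    theta = r / f               # angle from the optical axis
    phi = math.atan2(dy, dx)    # azimuth around the optical axis
    sin_t = math.sin(theta)
    return (
        distance * sin_t * math.cos(phi),
        distance * sin_t * math.sin(phi),
        distance * math.cos(theta),
    )


def project_equidistant(x, y, z, f, cx, cy):
    """Project a 3D point in camera coordinates back to fisheye pixels."""
    theta = math.atan2(math.hypot(x, y), z)
    phi = math.atan2(y, x)
    r = f * theta
    return (cx + r * math.cos(phi), cy + r * math.sin(phi))
```

Under this assumed model, a heatmap peak at pixel (u, v) with a predicted distance maps to a single 3D joint position, and projecting that position back recovers the same pixel, which is the sense in which such a representation stays aligned with the image.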