
BiPO: Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis

Hong, Seong-Eun ; Lim, Soobin ; Hwang, Juyeong ; Chang, Minwook ; Kang, Hyeongyeop
Abstract

Generating natural and expressive human motions from textual descriptions is challenging due to the complexity of coordinating full-body dynamics and capturing nuanced motion patterns over extended sequences that accurately reflect the given text. To address this, we introduce BiPO (Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis), a novel model that enhances text-to-motion synthesis by integrating part-based generation with a bidirectional autoregressive architecture. This integration allows BiPO to consider both past and future contexts during generation while enhancing detailed control over individual body parts, without requiring the ground-truth motion length. To relax the interdependency among body parts caused by this integration, we devise the Partial Occlusion technique, which probabilistically occludes information from certain motion parts during training. In our comprehensive experiments, BiPO achieves state-of-the-art performance on the HumanML3D dataset, outperforming recent methods such as ParCo, MoMask, and BAMM in terms of FID scores and overall motion quality. Notably, BiPO excels not only in the text-to-motion generation task but also in motion editing tasks that synthesize motion based on partially generated motion sequences and textual descriptions. These results demonstrate BiPO's effectiveness in advancing text-to-motion synthesis and its potential for practical applications.
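
The abstract does not say how Partial Occlusion is implemented, so the following is only a minimal sketch of the general idea of probabilistically hiding per-part information during training. It is not the authors' method. The function name `occlude_parts`, the part names, the zero-masking choice, and the default probability are all hypothetical.

```python
import random


def occlude_parts(part_features, p_occlude=0.5, rng=None):
    """Return a copy of part_features with some parts zeroed out.

    Hypothetical illustration: each body part is occluded independently
    with probability p_occlude, and "occlusion" is modeled as replacing
    that part's features with zeros. The actual BiPO scheme may differ.
    """
    rng = rng or random.Random()
    occluded = {}
    for name, feats in part_features.items():
        if rng.random() < p_occlude:
            # Hide this part: the model must rely on the other parts and the text.
            occluded[name] = [0.0] * len(feats)
        else:
            # Keep this part visible (copied so the input is not mutated).
            occluded[name] = list(feats)
    return occluded


# Example with made-up part names and feature values.
parts = {"left_arm": [0.1, 0.2], "torso": [0.3], "right_leg": [0.4, 0.5]}
masked = occlude_parts(parts, p_occlude=0.5, rng=random.Random(0))
```

Under these assumptions, a training loop would apply such a mask to the other parts' information before it is fed to each part's generator, so that no part learns to depend too heavily on any single other part.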
