MotionBERT: A Unified Perspective on Learning Human Motion Representations

Wentao Zhu, Xiaoxuan Ma, Zhaoyang Liu, Libin Liu, Wayne Wu, Yizhou Wang

Abstract

We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources. Specifically, we propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations. The motion representations acquired in this way incorporate geometric, kinematic, and physical knowledge about human motion, which can be easily transferred to multiple downstream tasks. We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network. It could capture long-range spatio-temporal relationships among the skeletal joints comprehensively and adaptively, exemplified by the lowest 3D pose estimation error so far when trained from scratch. Furthermore, our proposed framework achieves state-of-the-art performance on all three downstream tasks by simply finetuning the pretrained motion encoder with a simple regression head (1-2 layers), which demonstrates the versatility of the learned motion representations. Code and models are available at https://motionbert.github.io/
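The pretraining objective described in the abstract (recovering 3D motion from noisy, partial 2D observations) can be illustrated with a minimal sketch of how such training pairs might be built. This is not the authors' implementation: the function names `corrupt_2d` and `mean_joint_error`, the orthographic projection (dropping the depth coordinate), the Gaussian noise model, and the default `noise_std` and `mask_prob` values are all assumptions chosen for illustration.

```python
import math
import random


def corrupt_2d(motion_3d, noise_std=0.02, mask_prob=0.15, seed=0):
    """Build a noisy, partial 2D observation from a 3D motion sequence.

    motion_3d: list of frames, each a list of (x, y, z) joint tuples.
    Returns (noisy_2d, visible), both shaped frames x joints.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    noisy_2d, visible = [], []
    for frame in motion_3d:
        frame_2d, frame_vis = [], []
        for x, y, _z in frame:
            if rng.random() < mask_prob:
                # Masked joint: zero it out and flag it as unobserved.
                frame_2d.append((0.0, 0.0))
                frame_vis.append(False)
            else:
                # Assumed orthographic projection (drop z) plus Gaussian jitter.
                frame_2d.append((x + rng.gauss(0.0, noise_std),
                                 y + rng.gauss(0.0, noise_std)))
                frame_vis.append(True)
        noisy_2d.append(frame_2d)
        visible.append(frame_vis)
    return noisy_2d, visible


def mean_joint_error(pred_3d, target_3d):
    """Mean Euclidean distance between predicted and target 3D joints."""
    dists = [math.dist(p, t)
             for pf, tf in zip(pred_3d, target_3d)
             for p, t in zip(pf, tf)]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    # Two frames, two joints each (toy data, not a real skeleton).
    motion = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)],
              [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]]
    obs_2d, vis = corrupt_2d(motion)
    print(obs_2d, vis)
```

In a setup like this, a motion encoder would take `obs_2d` (and possibly `vis`) as input and be trained so that its 3D output minimizes something like `mean_joint_error` against the original `motion`; the actual loss terms used by MotionBERT are not stated in the abstract.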

