8 months ago

Object Tracking

Multi-Task Learning

Method/Architecture

Computer Vision

Jiashun Wang, Huazhe Xu, Medhini Narasimhan, Xiaolong Wang

Abstract

We propose a novel framework for multi-person 3D motion trajectory prediction. Our key observation is that a human's actions and behaviors may depend heavily on the other persons around them. Thus, instead of predicting each human pose trajectory in isolation, we introduce a Multi-Range Transformers model, which consists of a local-range encoder for individual motion and a global-range encoder for social interactions. The Transformer decoder then performs prediction for each person by taking a corresponding pose as a query that attends to both local-range and global-range encoder features. Our model not only outperforms state-of-the-art methods on long-term 3D motion prediction, but also generates diverse social interactions. More interestingly, our model can even predict 15-person motion simultaneously by automatically dividing the persons into different interaction groups. The project page with code is available at https://jiashunwang.github.io/MRT/.
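The query-over-two-ranges idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it uses no learned weights, stands in raw pose vectors for the encoder outputs, and joins the local and global features into a single attention memory, all of which are assumptions made for illustration. The names `attend` and `predict_step` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return context, weights


def predict_step(history, person):
    """Toy decoder step (hypothetical helper, not from the paper's code).

    history[p][t] is the pose vector of person p at frame t.
    """
    # Local range: only this person's own past frames.
    # Assumption: raw poses stand in for local-range encoder features.
    local_feats = history[person]
    # Global range: every person's past frames, flattened into one list.
    # Assumption: raw poses stand in for global-range encoder features.
    global_feats = [pose for frames in history for pose in frames]
    # The person's most recent pose acts as the decoder query.
    query = history[person][-1]
    # Assumption: both feature sets are joined into one attention memory.
    memory = local_feats + global_feats
    return attend(query, memory, memory)
```

Calling `predict_step(history, person=0)` on a history of 3 persons with 4 frames each returns a context vector with the same dimension as a pose, plus 16 attention weights (4 local and 12 global) that sum to 1. A real model would replace the raw poses with learned encoder outputs and add learned projections for the query, keys, and values.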

