Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds

Siyuan Huang Yichen Xie Song-Chun Zhu Yixin Zhu

Abstract

To date, various 3D scene understanding tasks still lack practical and generalizable pre-trained models, primarily due to the intricate nature of 3D scene understanding tasks and their immense variations introduced by camera views, lighting, occlusions, etc. In this paper, we tackle this challenge by introducing a spatio-temporal representation learning (STRL) framework, capable of learning from unlabeled 3D point clouds in a self-supervised fashion. Inspired by how infants learn from visual data in the wild, we explore the rich spatio-temporal cues derived from the 3D data. Specifically, STRL takes two temporally-correlated frames from a 3D point cloud sequence as the input, transforms it with the spatial data augmentation, and learns the invariant representation self-supervisedly. To corroborate the efficacy of STRL, we conduct extensive experiments on three types (synthetic, indoor, and outdoor) of datasets. Experimental results demonstrate that, compared with supervised learning methods, the learned self-supervised representation facilitates various models to attain comparable or even better performances while capable of generalizing pre-trained models to downstream tasks, including 3D shape classification, 3D object detection, and 3D semantic segmentation. Moreover, the spatio-temporal contextual cues embedded in 3D point clouds significantly improve the learned representations.
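The pipeline the abstract describes (pick two temporally correlated frames, apply a spatial augmentation to each, and pull their representations together) can be illustrated with a rough sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the function names, the specific augmentation (z-axis rotation, uniform scaling, Gaussian jitter), the hand-written `toy_encoder` standing in for a learned network, and the cosine-based `invariance_loss`. It shows only the data flow, not the authors' implementation.

```python
import math
import random


def sample_frame_indices(num_frames, rng, max_gap=2):
    """Pick two frame indices at most `max_gap` steps apart (hypothetical helper)."""
    i = rng.randrange(num_frames)
    j = min(num_frames - 1, i + rng.randint(0, max_gap))
    return i, j


def random_spatial_augment(points, rng, scale_range=(0.8, 1.25), jitter=0.01):
    """Assumed augmentation: random z-rotation, uniform scaling, per-point jitter."""
    theta = rng.uniform(-math.pi, math.pi)
    scale = rng.uniform(*scale_range)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    augmented = []
    for x, y, z in points:
        xr = cos_t * x - sin_t * y
        yr = sin_t * x + cos_t * y
        augmented.append((
            scale * xr + rng.gauss(0.0, jitter),
            scale * yr + rng.gauss(0.0, jitter),
            scale * z + rng.gauss(0.0, jitter),
        ))
    return augmented


def toy_encoder(points):
    """Stand-in for a learned encoder: order-independent statistics of point radii."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    radii = [math.dist(p, (cx, cy, cz)) for p in points]
    mean_r = sum(radii) / n
    std_r = math.sqrt(sum((r - mean_r) ** 2 for r in radii) / n)
    return [mean_r, max(radii), std_r]


def invariance_loss(u, v, eps=1e-12):
    """Assumed objective: 2 - 2*cos(u, v), i.e. squared distance of unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 2.0 - 2.0 * dot / (norm_u * norm_v + eps)


# Synthetic "sequence": one random cloud drifting slowly along x over 5 frames.
rng = random.Random(0)
base = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(64)]
sequence = [[(x + 0.05 * t, y, z) for x, y, z in base] for t in range(5)]

i, j = sample_frame_indices(len(sequence), rng)
view_a = random_spatial_augment(sequence[i], rng)
view_b = random_spatial_augment(sequence[j], rng)
loss = invariance_loss(toy_encoder(view_a), toy_encoder(view_b))
print(f"frames ({i}, {j}), invariance loss = {loss:.6f}")
```

In an actual training setup the encoder would be a trainable point cloud network and the loss would be minimized by gradient descent; the sketch only makes the "two frames, two augmentations, one invariance objective" structure concrete.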

