RFNet-4D++: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds with Cross-Attention Spatio-Temporal Features

Object reconstruction from 3D point clouds has been a long-standing research problem in computer vision and computer graphics, and has achieved impressive progress. However, reconstruction from time-varying point clouds (a.k.a. 4D point clouds) is generally overlooked. In this paper, we propose a new network architecture, namely RFNet-4D++, that jointly reconstructs objects and their motion flows from 4D point clouds. The key insight is that performing both tasks simultaneously, by learning spatial and temporal features from a sequence of point clouds, lets the individual tasks benefit from each other, leading to improved overall performance. To demonstrate this ability, we design a temporal vector field learning module that uses an unsupervised learning approach for the flow estimation task, leveraged by supervised learning of spatial structures for object reconstruction. Extensive experiments and analyses on benchmark datasets validate the effectiveness and efficiency of our method. As shown in the experimental results, our method achieves state-of-the-art performance on both flow estimation and object reconstruction while performing much faster than existing methods in both training and inference. Our code and data are available at https://github.com/hkust-vgd/RFNet-4D.