
Deep Hierarchical Representation of Point Cloud Videos via Spatio-Temporal Decomposition

Hehe Fan, Xin Yu, Yi Yang, Mohan Kankanhalli
Abstract

In point cloud videos, point coordinates are irregular and unordered but point timestamps exhibit regularities and order. Grid-based networks for conventional video processing cannot be directly used to model raw point cloud videos. Therefore, in this work, we propose a point-based network that directly handles raw point cloud videos. First, to preserve the spatio-temporal local structure of point cloud videos, we design a point tube covering a local range along spatial and temporal dimensions. By progressively subsampling frames and points and enlarging the spatial radius as the point features are fed into higher-level layers, the point tube can capture video structure in a spatio-temporally hierarchical manner. Second, to reduce the impact of the spatial irregularity on temporal modeling, we decompose space and time when extracting point tube representations. Specifically, a spatial operation is employed to capture the local structure of each spatial region in a tube and a temporal operation is used to model the dynamics of the spatial regions along the tube.
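
The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`spatial_op`, `point_tube_feature`), the use of a linear map followed by max-pooling as the spatial operation, and the per-frame-offset weight matrices as the temporal operation are all assumptions chosen for illustration. The sketch only shows the overall shape of the idea: within a point tube, each frame's spatial region is summarized by an order-independent operation, and the per-frame summaries are then combined in timestamp order.

```python
import numpy as np

def spatial_op(anchor, frame_pts, radius, w_spatial):
    """Summarize the neighbours of `anchor` within `radius` in a single frame."""
    offsets = frame_pts - anchor                      # (N, 3) displacements to the anchor
    mask = np.linalg.norm(offsets, axis=1) <= radius  # ball query around the anchor
    if not mask.any():
        return np.zeros(w_spatial.shape[1])           # empty region contributes nothing
    # Linear map of the displacements, then max-pool over the neighbourhood.
    # Max-pooling makes the result independent of the order of the points.
    return (offsets[mask] @ w_spatial).max(axis=0)

def point_tube_feature(video, t, anchor, radius, temporal_radius,
                       w_spatial, w_temporal):
    """Feature of one point tube centred on `anchor` at frame index `t`.

    video      : list of (N_i, 3) arrays, one per frame, in timestamp order
    w_spatial  : (3, C_mid) weights shared by all frames (assumed form)
    w_temporal : (2 * temporal_radius + 1, C_mid, C_out), one matrix per
                 frame offset (assumed form)
    """
    feat = np.zeros(w_temporal.shape[2])
    for k, dt in enumerate(range(-temporal_radius, temporal_radius + 1)):
        s = t + dt
        if 0 <= s < len(video):                       # skip frames outside the video
            region = spatial_op(anchor, video[s], radius, w_spatial)
            feat += region @ w_temporal[k]            # temporal op over ordered frames
    return feat

# Toy usage with random data: 5 frames of 64 points each.
rng = np.random.default_rng(0)
video = [rng.random((64, 3)) for _ in range(5)]
w_spatial = rng.standard_normal((3, 8))
w_temporal = rng.standard_normal((3, 8, 16))
feature = point_tube_feature(video, t=2, anchor=video[2][0], radius=0.3,
                             temporal_radius=1, w_spatial=w_spatial,
                             w_temporal=w_temporal)
print(feature.shape)
```

In a hierarchical network along the lines the abstract describes, the anchors and frames would be subsampled at each layer and `radius` enlarged for higher layers; that step is omitted here to keep the sketch short.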
