4D Panoptic Segmentation as Invariant and Equivariant Field Prediction

In this paper, we develop rotation-equivariant neural networks for 4D panoptic segmentation. 4D panoptic segmentation is a benchmark task for autonomous driving that requires recognizing semantic classes and object instances on the road from LiDAR scans, as well as assigning temporally consistent IDs to instances across time. We observe that the driving scenario is symmetric under rotations on the ground plane. Rotation equivariance could therefore provide better generalization and more robust feature learning. Specifically, we review object instance clustering strategies and restate the centerness-based approach and the offset-based approach as the prediction of invariant scalar fields and equivariant vector fields, respectively. Other sub-tasks are also unified from this perspective, and different invariant and equivariant layers are designed to facilitate their predictions. Through evaluation on the standard 4D panoptic segmentation benchmark of SemanticKITTI, we show that our equivariant models achieve higher accuracy at lower computational cost than their non-equivariant counterparts. Moreover, our method sets a new state of the art and achieves 1st place on the SemanticKITTI 4D Panoptic Segmentation leaderboard.
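The distinction between invariant scalar fields and equivariant vector fields can be illustrated with a minimal sketch on a toy instance. It is not the paper's implementation: the Gaussian form of `centerness`, the `sigma` parameter, and all function names are assumptions made for illustration, and the fields are computed directly from point coordinates rather than predicted by a network. The sketch rotates three points about the vertical axis and checks that centerness values are unchanged while centroid offsets rotate with the scene.

```python
import math


def rotate_z(p, theta):
    """Rotate a 3D point or vector about the vertical (z) axis by theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)


def centroid(points):
    """Mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def offsets(points):
    """Vector field: offset from each point to the instance centroid."""
    c = centroid(points)
    return [tuple(c[i] - p[i] for i in range(3)) for p in points]


def centerness(points, sigma=1.0):
    """Scalar field: Gaussian of the distance to the centroid (assumed form)."""
    return [
        math.exp(-sum(o * o for o in off) / (2 * sigma ** 2))
        for off in offsets(points)
    ]


def close(a, b, tol=1e-9):
    """Element-wise comparison of two equal-length sequences of floats."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


# Toy object instance and a ground-plane rotation (both hypothetical).
points = [(1.0, 0.0, 0.2), (2.0, 1.0, 0.4), (1.5, -0.5, 0.0)]
theta = math.pi / 3
rotated = [rotate_z(p, theta) for p in points]

# Invariance: the scalar field takes the same values after rotating the input.
invariant_ok = close(centerness(rotated), centerness(points))

# Equivariance: the vector field of the rotated input equals the rotated field.
equivariant_ok = all(
    close(o_rot, rotate_z(o, theta))
    for o_rot, o in zip(offsets(rotated), offsets(points))
)

print(invariant_ok, equivariant_ok)
```

Both checks hold because a rotation is linear and preserves distances: the centroid of the rotated points is the rotated centroid, so offsets rotate with the input while their norms, and hence the centerness values, stay fixed.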