2D Semantic-Guided Semantic Scene Completion
Semantic scene completion (SSC) aims to simultaneously perform scene completion (SC) and predict semantic categories of a 3D scene from a single depth and/or RGB image. Most existing SSC methods struggle to handle complex regions with multiple objects close to each other, especially objects with reflective or dark surfaces. This difficulty primarily stems from two challenges: (1) the loss of geometric information caused by unreliable depth values from sensors, and (2) the potential for semantic confusion when simultaneously predicting 3D shapes and semantic labels. To address these problems, we propose a Semantic-guided Semantic Scene Completion framework, dubbed SG-SSC, which comprises a Semantic-guided Fusion (SGF) module and a Volume-guided Semantic Predictor (VGSP). Guided by 2D semantic segmentation maps, SGF adaptively fuses RGB and depth features to compensate for the geometric information lost through missing values in depth images, making the model more robust to unreliable depth input. VGSP exploits the mutual benefit between the SC and SSC tasks: it makes the semantic prediction focus on voxels with high occupancy probability, while allowing SC to exploit semantic priors to better predict voxel occupancy. Experimental results show that SG-SSC outperforms existing state-of-the-art methods on the NYU, NYUCAD, and SemanticKITTI datasets. Models and code are available at https://github.com/aipixel/SG-SSC.
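To make the two components concrete, below is a minimal PyTorch sketch of the ideas described above: semantic-map-guided fusion of RGB and depth features, and occupancy-gated semantic prediction. All module names, layer choices, and channel sizes here are illustrative assumptions, not the authors' implementation; for the actual architecture, see the repository linked above.

```python
# A minimal sketch of SGF-style fusion and VGSP-style gating.
# Module names, layers, and shapes are assumptions for illustration only.
import torch
import torch.nn as nn


class SemanticGuidedFusion(nn.Module):
    """Fuse RGB and depth features with weights predicted from a 2D semantic map."""

    def __init__(self, feat_ch: int, num_classes: int):
        super().__init__()
        # Predict per-pixel fusion weights from the 2D semantic segmentation map.
        self.gate = nn.Sequential(
            nn.Conv2d(num_classes, feat_ch, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat, depth_feat, sem_map):
        # sem_map: (B, num_classes, H, W) class scores from a 2D segmenter.
        w = self.gate(sem_map)  # (B, feat_ch, H, W), values in (0, 1)
        # Where depth is unreliable, the gate can down-weight depth features
        # and lean on RGB features instead.
        return w * rgb_feat + (1.0 - w) * depth_feat


class VolumeGuidedSemanticPredictor(nn.Module):
    """Condition per-voxel semantic logits on predicted occupancy."""

    def __init__(self, feat_ch: int, num_classes: int):
        super().__init__()
        self.occ_head = nn.Conv3d(feat_ch, 1, kernel_size=1)            # SC branch
        self.sem_head = nn.Conv3d(feat_ch, num_classes, kernel_size=1)  # semantic branch

    def forward(self, vox_feat):
        # vox_feat: (B, feat_ch, D, H, W) fused volumetric features.
        occ = torch.sigmoid(self.occ_head(vox_feat))  # per-voxel occupancy probability
        sem = self.sem_head(vox_feat)                 # raw per-class logits
        # Focus semantic prediction on voxels likely to be occupied.
        return occ, sem * occ
```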