LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering

Although significant progress has been made in room layout estimation, most methods aim to minimize a loss in 2D pixel coordinates rather than exploiting the room structure in 3D space. Towards reconstructing the room layout in 3D, we formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama. Specifically, we propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable, thus making our proposed model end-to-end trainable while leveraging 3D geometric information, without requiring ground-truth depth. Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets. Moreover, our formulation enables a pre-training step on depth datasets, which further improves the generalizability of our layout estimation model.
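To make the horizon-depth formulation concrete, the following is a minimal sketch (not the authors' implementation) of how a predicted floor-wall boundary in an equirectangular panorama can be converted to depth along the horizon line in a differentiable way. It assumes a known camera height and a per-column boundary latitude; all names (`boundary_to_horizon_depth`, `camera_height`, `floor_lat`) are illustrative.

```python
# Sketch only: converts per-column floor-boundary latitudes to horizon depths,
# assuming an equirectangular panorama and a known camera height.
import torch

def boundary_to_horizon_depth(floor_lat, camera_height=1.6):
    """floor_lat: tensor of shape (W,), latitudes in radians, in (-pi/2, 0)
    (negative = below the horizon). Returns the horizontal distance from the
    camera to the wall for each image column, computed differentiably."""
    # For a boundary at latitude theta below the horizon, similar triangles
    # give distance = camera_height / tan(-theta).
    return camera_height / torch.tan(-floor_lat)

# Usage: depths obtained this way can be compared (e.g., with an L1 loss)
# against depths rendered from a ground-truth layout; gradients flow back
# to the layout prediction through the conversion.
pred_lat = torch.full((1024,), -0.5, requires_grad=True)  # dummy prediction
depth = boundary_to_horizon_depth(pred_lat)
depth.mean().backward()  # gradients reach pred_lat
```

Because the conversion uses only elementary differentiable operations, the depth-space loss can supervise the layout network end-to-end, which is the property the abstract attributes to Differentiable Depth Rendering.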