PIXOR: Real-time 3D Object Detection from Point Clouds

We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. Computation speed is critical, as detection is a necessary component for safety. Existing approaches are, however, computationally expensive due to the high dimensionality of point clouds. We utilize the 3D data more efficiently by representing the scene from the Bird's Eye View (BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. The input representation, network architecture, and model optimization are specifically designed to balance high accuracy and real-time efficiency. We validate PIXOR on two datasets: the KITTI BEV object detection benchmark, and a large-scale 3D vehicle detection benchmark. On both datasets we show that the proposed detector notably surpasses other state-of-the-art methods in terms of Average Precision (AP), while still running at >28 FPS.
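
To make the Bird's Eye View idea concrete, the following is a minimal sketch of how a point cloud might be projected onto a 2D BEV occupancy grid. It is an illustration under stated assumptions, not the input representation used by PIXOR: the function name `points_to_bev_occupancy`, the axis convention (x forward, y left), and the ranges and resolution in the usage lines are all hypothetical and chosen only for the example.

```python
import math


def points_to_bev_occupancy(points, x_range, y_range, resolution):
    """Project (x, y, z) points onto a 2D bird's-eye-view occupancy grid.

    Hypothetical helper for illustration only. Rows index the x axis,
    columns index the y axis; a cell is 1 if at least one point falls in it.
    The z coordinate is ignored in this simplified sketch.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    n_rows = int(round((x_max - x_min) / resolution))
    n_cols = int(round((y_max - y_min) / resolution))
    grid = [[0] * n_cols for _ in range(n_rows)]

    for x, y, _z in points:
        # Skip points outside the region of interest.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        row = min(int(math.floor((x - x_min) / resolution)), n_rows - 1)
        col = min(int(math.floor((y - y_min) / resolution)), n_cols - 1)
        grid[row][col] = 1
    return grid


# Illustrative values only: a 2 m x 2 m region at 1 m resolution.
points = [(0.5, 0.5, 0.0), (1.5, -0.5, 1.0), (10.0, 0.0, 0.0)]
grid = points_to_bev_occupancy(points, (0.0, 2.0), (-1.0, 1.0), 1.0)
print(grid)  # [[0, 1], [1, 0]]; the third point lies outside the region
```

Collapsing the scene to a 2D grid of this kind is what allows a detector to apply dense 2D convolutions and make one prediction per grid cell, which is the sense in which the predictions above are described as pixel-wise.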