VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Yin Zhou, Oncel Tuzel

Abstract

Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird's eye view projection. In this work, we remove the need for manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to an RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR-based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR.
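
The voxel partitioning described in the abstract can be illustrated with a minimal sketch. It groups points into equally spaced 3D voxels and then reduces each group to one fixed-length vector. Everything named here is an assumption made for illustration: the function names `voxelize` and `toy_voxel_feature`, the 0.5-unit voxel size, and the sample points are hypothetical. In particular, `toy_voxel_feature` is a hand-written stand-in (centroid offsets followed by element-wise max pooling), not the learned VFE layer, whose details the abstract does not give.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size, range_min):
    """Group 3D points into equally spaced voxels.

    Returns a dict mapping an integer voxel index (ix, iy, iz)
    to the list of points that fall inside that voxel.
    """
    voxels = defaultdict(list)
    for p in points:
        # Integer grid coordinate of the voxel containing point p.
        key = tuple(
            math.floor((c - lo) / s)
            for c, lo, s in zip(p, range_min, voxel_size)
        )
        voxels[key].append(tuple(p))
    return dict(voxels)


def toy_voxel_feature(pts):
    """Hypothetical stand-in for a learned per-voxel encoder.

    Appends each point's offset from the voxel centroid, then takes an
    element-wise max over the points, giving one 6-value vector per voxel.
    """
    n = len(pts)
    centroid = [sum(p[d] for p in pts) / n for d in range(3)]
    augmented = [
        list(p) + [p[d] - centroid[d] for d in range(3)] for p in pts
    ]
    return [max(row[d] for row in augmented) for d in range(6)]


# Hypothetical sample data: two points share a voxel, one is alone.
points = [(0.1, 0.1, 0.1), (0.3, 0.2, 0.1), (1.2, 0.1, 0.1)]
voxels = voxelize(points, voxel_size=(0.5, 0.5, 0.5), range_min=(0.0, 0.0, 0.0))
features = {key: toy_voxel_feature(pts) for key, pts in voxels.items()}
```

Only non-empty voxels appear in the dictionary, which reflects the sparsity of LiDAR data mentioned in the abstract: the vast majority of grid cells contain no points and need no feature vector.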