RCFusion: Fusing 4-D Radar and Camera With Bird’s-Eye View Features for 3-D Object Detection

Zhixiong Ma, Xichan Zhu, Jie Bai, Libo Huang, Sihan Chen, Long Yan, Bin Tan, Sen Li, Lianqing Zheng
Abstract

Camera and millimeter-wave (MMW) radar fusion is essential for accurate and robust autonomous driving systems. With the advancement of radar technology, next-generation high-resolution automotive radar, i.e., 4-D radar, has emerged. In addition to the range, azimuth, and Doppler velocity measurements of traditional radar, 4-D radar provides elevation measurements, yielding a denser point cloud. In this study, we propose a camera and 4-D radar fusion network called RCFusion, which fuses multimodal features in a unified bird's-eye-view (BEV) space for 3-D object detection. In the camera stream, multiscale feature maps are obtained by the image backbone and feature pyramid network (FPN) and then converted into orthographic feature maps by an orthographic feature transform (OFT). Next, enhanced, fine-grained image BEV features are obtained via a designed shared attention encoder. Meanwhile, in the 4-D radar stream, a newly designed component named radar PillarNet efficiently encodes the radar features into radar pseudo-images, which are fed into the point cloud backbone to produce radar BEV features. An interactive attention module (IAM) is proposed for the fusion stage, producing an effective fusion of the BEV features from both modalities. Finally, a generic detection head predicts object classes and locations. The proposed RCFusion is validated on the TJ4DRadSet and View-of-Delft (VoD) datasets. The experimental results and analysis show that the proposed method effectively fuses camera and 4-D radar features to achieve robust detection performance.

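The abstract does not specify how the interactive attention module combines the two BEV feature maps, but the general idea of fusing camera and radar features defined on the same bird's-eye-view grid can be illustrated with a minimal PyTorch sketch. The `SimpleBEVFusion` module, the channel count, and the grid size below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): fusing camera and radar BEV
# feature maps of identical spatial size with a learned, per-cell weighted sum.
import torch
import torch.nn as nn


class SimpleBEVFusion(nn.Module):
    """Toy stand-in for attention-based fusion of two aligned BEV feature maps.

    Both inputs are assumed to be (B, C, H, W) tensors on the same BEV grid;
    the module predicts a two-channel weight map (camera vs. radar) and mixes
    the modalities before a detection head would consume the result.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.weight_net = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, kernel_size=1),
        )

    def forward(self, cam_bev: torch.Tensor, radar_bev: torch.Tensor) -> torch.Tensor:
        # Per-location softmax weights over the two modalities.
        weights = torch.softmax(
            self.weight_net(torch.cat([cam_bev, radar_bev], dim=1)), dim=1
        )
        # Weighted sum of the two modalities at every BEV cell.
        return weights[:, 0:1] * cam_bev + weights[:, 1:2] * radar_bev


if __name__ == "__main__":
    fusion = SimpleBEVFusion(channels=64)
    cam = torch.randn(2, 64, 128, 128)    # camera BEV features (B, C, H, W)
    radar = torch.randn(2, 64, 128, 128)  # radar BEV features on the same grid
    fused = fusion(cam, radar)
    print(fused.shape)  # torch.Size([2, 64, 128, 128])
```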