CubifAE-3D: Monocular Camera Space Cubification for Auto-Encoder based 3D Object Detection

Shrivastava, Shubham ; Chakravarty, Punarjay
Abstract

We introduce a method for 3D object detection using a single monocular image. Starting from a synthetic dataset, we pre-train an RGB-to-Depth Auto-Encoder (AE). The embedding learnt from this AE is then used to train a 3D Object Detector (3DOD) CNN, which regresses the parameters of 3D object poses after the encoder from the AE generates a latent embedding from the RGB image. We show that we can pre-train the AE once using paired RGB and depth images from simulation data and subsequently train only the 3DOD network using real data, comprising RGB images and 3D object pose labels (without the requirement of dense depth). Our 3DOD network utilizes a particular 'cubification' of 3D space around the camera, where each cuboid is tasked with predicting N object poses, along with their class and confidence values. The AE pre-training and this method of dividing the 3D space around the camera into cuboids give our method its name - CubifAE-3D. We demonstrate results for monocular 3D object detection in the Autonomous Vehicle (AV) use-case with the Virtual KITTI 2 and the KITTI datasets.
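
To make the idea of 'cubification' concrete, the following is a minimal sketch of dividing the space around the camera into a regular grid of cuboids and assigning labelled object centers to them, with at most N objects kept per cuboid. It is an illustration only and not the authors' implementation: the names `CubeGrid`, `cell_index`, and `assign_objects`, the axis convention, the spatial extents, and the grid resolution are all assumptions, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeGrid:
    """Hypothetical regular grid of cuboids in the camera frame."""
    # Assumed extents in metres (x right, y down, z forward); illustrative values.
    x_range: tuple = (-40.0, 40.0)
    y_range: tuple = (-5.0, 5.0)
    z_range: tuple = (0.0, 100.0)
    # Assumed number of cuboids along each axis.
    shape: tuple = (4, 4, 4)

    def cell_index(self, point):
        """Return the (i, j, k) index of the cuboid containing `point`,
        or None if the point lies outside the gridded volume."""
        ranges = (self.x_range, self.y_range, self.z_range)
        index = []
        for value, (lo, hi), n in zip(point, ranges, self.shape):
            if not (lo <= value < hi):
                return None
            # Clamp to n - 1 to guard against floating-point rounding at the upper edge.
            index.append(min(int((value - lo) / (hi - lo) * n), n - 1))
        return tuple(index)


def assign_objects(grid, centers, max_per_cell):
    """Group object centers by cuboid, keeping at most `max_per_cell`
    (the N of the abstract) objects in each cuboid."""
    cells = {}
    for center in centers:
        index = grid.cell_index(center)
        if index is None:
            continue  # object lies outside the gridded volume
        slot = cells.setdefault(index, [])
        if len(slot) < max_per_cell:
            slot.append(center)
    return cells


if __name__ == "__main__":
    grid = CubeGrid()
    centers = [(0.0, 0.0, 10.0), (1.0, 0.5, 12.0), (-30.0, 1.0, 80.0)]
    for index, objs in sorted(assign_objects(grid, centers, max_per_cell=1).items()):
        print(index, objs)
```

In a detector of this kind, each cuboid's assigned objects would presumably become that cuboid's regression targets (pose, class, confidence); how the targets are encoded is not stated in the abstract and is left out of the sketch.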
