6 months ago

Object Detection

Computer Vision

Multi-Task Learning

Method/Architecture

Maksim Kolodiazhnyi, Anna Vorontsova, Matvey Skripkin, Danila Rukhovich, Anton Konushin

Abstract

Growing customer demand for smart solutions in robotics and augmented reality has attracted considerable attention to 3D object detection from point clouds. Yet, existing indoor datasets taken individually are too small and insufficiently diverse to train a powerful and general 3D object detection model. In the meantime, more general approaches utilizing foundation models are still inferior in quality to those based on supervised training for a specific task. In this work, we propose UniDet3D, a simple yet effective 3D object detection model, which is trained on a mixture of indoor datasets and is capable of working in various indoor environments. By unifying different label spaces, UniDet3D enables learning a strong representation across multiple datasets through a supervised joint training scheme. The proposed network architecture is built upon a vanilla transformer encoder, making it easy to run, customize and extend the prediction pipeline for practical use. Extensive experiments demonstrate that UniDet3D obtains significant gains over existing 3D object detection methods on 6 indoor benchmarks: ScanNet (+1.1 mAP50), ARKitScenes (+19.4 mAP25), S3DIS (+9.1 mAP50), MultiScan (+9.3 mAP50), 3RScan (+3.2 mAP50), and ScanNet++ (+2.7 mAP50). Code is available at https://github.com/filapro/unidet3d.
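
The abstract does not say how the label spaces are unified, so the following is only a minimal sketch of one simple approach: merging per-dataset class lists by normalized class name and remapping each dataset's local class ids to the shared ids. The function name `build_unified_label_space`, the dataset names, and the toy class lists are all hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List, Tuple


def build_unified_label_space(
    dataset_classes: Dict[str, List[str]],
) -> Tuple[List[str], Dict[str, Dict[int, int]]]:
    """Merge per-dataset class lists into one shared label space.

    Assumption: classes with the same normalized name (lowercased,
    whitespace-stripped) are treated as the same category.
    Returns the unified class list and, for each dataset, a mapping
    from local class id to unified class id.
    """
    unified: List[str] = []
    index_of: Dict[str, int] = {}
    mappings: Dict[str, Dict[int, int]] = {}

    for dataset, classes in dataset_classes.items():
        mapping: Dict[int, int] = {}
        for local_id, name in enumerate(classes):
            key = name.strip().lower()
            if key not in index_of:
                # First time this category is seen: give it a new unified id.
                index_of[key] = len(unified)
                unified.append(key)
            mapping[local_id] = index_of[key]
        mappings[dataset] = mapping

    return unified, mappings


# Toy, made-up class lists for two hypothetical datasets.
toy_classes = {
    "dataset_a": ["chair", "table", "sofa"],
    "dataset_b": ["Table", "bed", "chair"],
}
unified, mappings = build_unified_label_space(toy_classes)
```

With such a mapping, ground-truth boxes from every dataset can be relabeled into the shared space before joint supervised training. Matching by name alone would miss synonyms such as "sofa" and "couch", so a real pipeline would likely need a curated mapping.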

UniDet3D: Multi-dataset Indoor 3D Object Detection