
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

Hou, Xiuquan ; Liu, Meiqin ; Zhang, Senlin ; Wei, Ping ; Chen, Badong ; Lan, Xuguang
Abstract

This paper presents a general scheme for enhancing the convergence and performance of DETR (DEtection TRansformer). We investigate the slow convergence problem in transformers from a new perspective, suggesting that it arises from the self-attention that introduces no structural bias over inputs. To address this issue, we explore incorporating position relation prior as attention bias to augment object detection, following the verification of its statistical significance using a proposed quantitative macroscopic correlation (MC) metric. Our approach, termed Relation-DETR, introduces an encoder to construct position relation embeddings for progressive attention refinement, which further extends the traditional streaming pipeline of DETR into a contrastive relation pipeline to address the conflicts between non-duplicate predictions and positive supervision. Extensive experiments on both generic and task-specific datasets demonstrate the effectiveness of our approach. Under the same configurations, Relation-DETR achieves a significant improvement (+2.0% AP compared to DINO), state-of-the-art performance (51.7% AP for 1x and 52.1% AP for 2x settings), and a remarkably faster convergence speed (over 40% AP with only 2 training epochs) than existing DETR detectors on COCO val2017. Moreover, the proposed relation encoder serves as a universal plug-in-and-play component, bringing clear improvements for theoretically any DETR-like methods. Furthermore, we introduce a class-agnostic detection dataset, SA-Det-100k. The experimental results on the dataset illustrate that the proposed explicit position relation achieves a clear improvement of 1.3% AP, highlighting its potential towards universal object detection. The code and dataset are available at https://github.com/xiuqhou/Relation-DETR.
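
The core idea in the abstract, adding a position relation prior to self-attention as a bias, can be illustrated with a small sketch. This is not the authors' relation encoder: the function `pairwise_relation_bias`, its scalar distance-based formula, and the `scale` parameter are hypothetical stand-ins chosen only to show where such a bias would enter the attention computation. The actual method builds learned relation embeddings, which are not reproduced here.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_relation_bias(boxes, scale=1.0):
    """Hypothetical scalar bias for each pair of boxes (cx, cy, w, h).

    ASSUMPTION: the bias is a negative log of the size-normalized center
    distance, so nearby boxes get a larger (less negative) bias. This is an
    illustrative choice, not the formula used by Relation-DETR.
    """
    bias = []
    for (cxi, cyi, wi, hi) in boxes:
        row = []
        for (cxj, cyj, _wj, _hj) in boxes:
            dx = abs(cxi - cxj) / wi
            dy = abs(cyi - cyj) / hi
            row.append(-scale * math.log1p(dx + dy))
        bias.append(row)
    return bias


def biased_attention_weights(queries, keys, bias):
    """Scaled dot-product attention weights with an additive pairwise bias."""
    d = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        logits = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) + bias[i][j]
            for j, k in enumerate(keys)
        ]
        weights.append(softmax(logits))
    return weights


# Three object queries with identical content features, so any difference in
# the attention weights comes from the position bias alone.
boxes = [(0.20, 0.20, 0.10, 0.10), (0.22, 0.20, 0.10, 0.10), (0.80, 0.80, 0.10, 0.10)]
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
weights = biased_attention_weights(features, features, pairwise_relation_bias(boxes))
```

With identical content features, plain self-attention would weight all three queries equally. With the bias added, the first query attends more to its overlapping neighbor than to the distant box, which is the kind of structural prior the abstract argues plain self-attention lacks.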
