DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection

Few-shot object detection, which aims to rapidly detect novel objects from extremely few annotated examples of previously unseen classes, has attracted significant research interest in the community. Most existing approaches employ Faster R-CNN as the basic detection framework, yet their performance is often unsatisfactory due to the lack of tailored considerations for the data-scarce scenario. In this paper, we look closely into the conventional Faster R-CNN and analyze its contradictions from two orthogonal perspectives, namely multi-stage (RPN vs. RCNN) and multi-task (classification vs. localization). To resolve these issues, we propose a simple yet effective architecture named Decoupled Faster R-CNN (DeFRCN). Concretely, we extend Faster R-CNN by introducing a Gradient Decoupled Layer for multi-stage decoupling and a Prototypical Calibration Block for multi-task decoupling. The former is a novel deep layer that redefines the feature-forward and gradient-backward operations to decouple its subsequent layer from its preceding layer; the latter is an offline prototype-based classification model that takes the detector's proposals as input and boosts the original classification scores with additional pairwise scores for calibration. Extensive experiments on multiple benchmarks show that our framework is remarkably superior to existing approaches and establishes a new state of the art in the few-shot literature.
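The two components can be sketched in code. The following is a minimal NumPy illustration, not the paper's implementation: it assumes the Gradient Decoupled Layer acts as an identity in the forward pass while scaling the backward gradient by a decoupling coefficient, and it assumes the Prototypical Calibration Block fuses classifier scores with prototype cosine similarities via a simple weighted sum (the names `decouple_coeff` and `alpha` are hypothetical).

```python
import numpy as np

class GradientDecoupledLayer:
    """Sketch of multi-stage decoupling: features pass through unchanged,
    while the gradient flowing back to the preceding (shared backbone)
    layers is scaled by `decouple_coeff` (0.0 stops the gradient entirely,
    1.0 recovers standard backprop). Assumption: identity forward transform."""

    def __init__(self, decouple_coeff: float):
        self.decouple_coeff = decouple_coeff

    def forward(self, features: np.ndarray) -> np.ndarray:
        # Feature-forward operation: pass features through untouched.
        return features

    def backward(self, grad_output: np.ndarray) -> np.ndarray:
        # Gradient-backward operation: attenuate the upstream gradient
        # before it reaches the preceding layers.
        return self.decouple_coeff * grad_output


def prototypical_calibration(cls_scores, proposal_feat, prototypes, alpha=0.5):
    """Sketch of multi-task decoupling via score calibration: boost the
    detector's classification scores with pairwise cosine similarities
    between the proposal feature and per-class prototypes. The weighted-sum
    fusion with `alpha` is an assumption for illustration."""
    feat = proposal_feat / (np.linalg.norm(proposal_feat) + 1e-8)
    protos = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    pairwise = protos @ feat  # cosine similarity to each class prototype
    return alpha * cls_scores + (1.0 - alpha) * pairwise
```

With `decouple_coeff` between 0 and 1, the RPN and RCNN heads receive full features in the forward pass but exert different (attenuated) influence on the shared backbone during backpropagation, which is the essence of the multi-stage decoupling described above.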