
Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition

Gao, Bin-Bin ; Zhou, Hong-Yu

Abstract

Multi-label image recognition is a practical and challenging task compared to single-label image classification. However, previous works may be suboptimal because of a great number of object proposals or complex attentional region generation modules. In this paper, we propose a simple but efficient two-stream framework to recognize multi-category objects from global image to local regions, similar to how human beings perceive objects. To bridge the gap between global and local streams, we propose a multi-class attentional region module which aims to make the number of attentional regions as small as possible and keep the diversity of these regions as high as possible. Our method can efficiently and effectively recognize multi-class objects with an affordable computation cost and a parameter-free region localization module. Over three benchmarks on multi-label image classification, we create new state-of-the-art results with a single model only using image semantics without label dependency. In addition, the effectiveness of the proposed method is extensively demonstrated under different factors such as global pooling strategy, input size and network architecture. Code has been made available at https://github.com/gaobb/MCAR.
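
The abstract does not spell out how the attentional regions are found, so the following is only an illustrative sketch of one way a parameter-free global-to-local step could look, not the authors' implementation. It assumes the global stream yields a confidence score and a spatial activation map for each class. Those inputs, the function name `localize_regions`, and the `top_k` and `threshold` parameters are all hypothetical. Keeping only the highest-scoring classes keeps the number of regions small, and taking one region per distinct class keeps them diverse.

```python
import numpy as np


def localize_regions(scores, maps, top_k=2, threshold=0.5):
    """Pick one box per top-scoring class from per-class activation maps.

    scores: array of shape (C,), global per-class confidences (assumed input).
    maps:   array of shape (C, H, W), per-class activation maps (assumed input).
    Returns a list of (class_index, (x0, y0, x1, y1)) with exclusive x1, y1.
    No learned weights are involved: only sorting, scaling and thresholding.
    """
    scores = np.asarray(scores, dtype=float)
    maps = np.asarray(maps, dtype=float)
    # Few regions: keep only the top_k most confident classes.
    top_classes = np.argsort(scores)[::-1][:top_k]

    regions = []
    for c in top_classes:
        m = maps[c]
        lo, hi = m.min(), m.max()
        # Min-max scale the map to [0, 1] so one threshold fits every class.
        norm = (m - lo) / (hi - lo + 1e-8)
        ys, xs = np.nonzero(norm >= threshold)
        if ys.size == 0:
            # Flat map: fall back to the whole image.
            box = (0, 0, m.shape[1], m.shape[0])
        else:
            box = (int(xs.min()), int(ys.min()),
                   int(xs.max()) + 1, int(ys.max()) + 1)
        regions.append((int(c), box))
    return regions
```

In a two-stream setup of this kind, each returned box would be cropped from the input image, resized, and passed to the local stream, whose predictions are then combined with the global ones. The combination rule is not described in the abstract and is left out here.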