Multi-Label Image Recognition with Graph Convolutional Networks

The task of multi-label image recognition is to predict a set of objectlabels that present in an image. As objects normally co-occur in an image, itis desirable to model the label dependencies to improve the recognitionperformance. To capture and explore such important dependencies, we propose amulti-label classification model based on Graph Convolutional Network (GCN).The model builds a directed graph over the object labels, where each node(label) is represented by word embeddings of a label, and GCN is learned to mapthis label graph into a set of inter-dependent object classifiers. Theseclassifiers are applied to the image descriptors extracted by another sub-net,enabling the whole network to be end-to-end trainable. Furthermore, we proposea novel re-weighted scheme to create an effective label correlation matrix toguide information propagation among the nodes in GCN. Experiments on twomulti-label image recognition datasets show that our approach obviouslyoutperforms other existing state-of-the-art methods. In addition, visualizationanalyses reveal that the classifiers learned by our model maintain meaningfulsemantic topology.