GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

Yao, Ruijie ; Jin, Sheng ; Xu, Lumin ; Zeng, Wang ; Liu, Wentao ; Qian, Chen ; Luo, Ping ; Wu, Ji
Abstract

Multi-Label Image Recognition (MLIR) is a challenging task that aims to predict multiple object labels in a single image while modeling the complex relationships between labels and image regions. Although convolutional neural networks and vision transformers have succeeded in processing images as regular grids of pixels or patches, these representations are sub-optimal for capturing irregular and discontinuous regions of interest. In this work, we present the first fully graph convolutional model, Group K-nearest neighbor based Graph convolutional Network (GKGNet), which models the connections between semantic label embeddings and image patches in a flexible and unified graph structure. To address the scale variance of different objects and to capture information from multiple perspectives, we propose the Group KGCN module for dynamic graph construction and message passing. Our experiments demonstrate that GKGNet achieves state-of-the-art performance with significantly lower computational costs on the challenging multi-label datasets MS-COCO and VOC2007. Code is available at https://github.com/jin-s13/GKGNet.
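
The idea of group-wise K-nearest-neighbor graph construction can be illustrated with a small sketch. This is not the authors' implementation (see the repository above for that); it is a minimal pure-Python illustration under stated assumptions. The feature channels are split into equal groups, squared Euclidean distance is used to pick neighbors, and mean pooling stands in for message passing. The function names `split_groups`, `group_knn`, and `aggregate` are hypothetical.

```python
def split_groups(vec, num_groups):
    """Split a feature vector into num_groups equal-width channel groups."""
    size = len(vec) // num_groups
    return [vec[g * size:(g + 1) * size] for g in range(num_groups)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def group_knn(label_emb, patch_feats, num_groups, k):
    """For each channel group, return the indices of the k patches
    closest to the label embedding within that group's channels."""
    label_groups = split_groups(label_emb, num_groups)
    patch_groups = [split_groups(p, num_groups) for p in patch_feats]
    neighbors = []
    for g in range(num_groups):
        dists = [(sq_dist(label_groups[g], pg[g]), i)
                 for i, pg in enumerate(patch_groups)]
        neighbors.append([i for _, i in sorted(dists)[:k]])
    return neighbors


def aggregate(label_emb, patch_feats, num_groups, k):
    """Assumed message passing: per group, average the selected patches'
    sub-vectors, then concatenate the group results."""
    patch_groups = [split_groups(p, num_groups) for p in patch_feats]
    out = []
    for g, idxs in enumerate(group_knn(label_emb, patch_feats, num_groups, k)):
        selected = [patch_groups[i][g] for i in idxs]
        out.extend(sum(col) / len(selected) for col in zip(*selected))
    return out


# Toy data: one label embedding and four patch features, 4 channels each.
label = [0.0, 0.0, 1.0, 1.0]
patches = [
    [0.0, 0.0, 5.0, 5.0],
    [1.0, 1.0, 1.0, 1.0],
    [4.0, 4.0, 0.9, 1.1],
    [0.1, 0.0, 3.0, 3.0],
]
print(group_knn(label, patches, num_groups=2, k=2))  # → [[0, 3], [1, 2]]
```

In this toy example the two channel groups connect the label to different patches, which is one way to read "capturing information from multiple perspectives": each group builds its own neighborhood instead of sharing a single K-nearest-neighbor graph.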
