Macro-Micro Adversarial Network for Human Parsing

In human parsing, the pixel-wise classification loss has two drawbacks: low-level local inconsistency and high-level semantic inconsistency. Introducing an adversarial network tackles both problems with a single discriminator. However, the two types of parsing inconsistency are generated by distinct mechanisms, so it is difficult for a single discriminator to solve them both. To address the two kinds of inconsistency, this paper proposes the Macro-Micro Adversarial Net (MMAN). It has two discriminators. One discriminator, Macro D, acts on the low-resolution label map and penalizes semantic inconsistency, e.g., misplaced body parts. The other discriminator, Micro D, focuses on multiple patches of the high-resolution label map to address local inconsistency, e.g., blur and holes. Compared with traditional adversarial networks, MMAN not only enforces local and semantic consistency explicitly, but also avoids the poor convergence problem of adversarial networks when handling high-resolution images. In our experiments, we validate that the two discriminators are complementary to each other in improving human parsing accuracy. The proposed framework produces parsing performance competitive with state-of-the-art methods, i.e., mIoU=46.81% and 59.91% on LIP and PASCAL-Person-Part, respectively. On the relatively small PPSS dataset, our pre-trained model demonstrates impressive generalization ability. The code is publicly available at https://github.com/RoyalVane/MMAN.
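
To make the macro/micro distinction concrete, the following minimal sketch shows the two views of a label map that the two discriminators would inspect. It is an illustration under stated assumptions, not the authors' implementation: the helper names `macro_view` and `micro_patches`, the nearest-neighbour downsampling, the non-overlapping patch layout, and the toy class ids are all hypothetical choices made for this example.

```python
# Illustrative sketch only; helper names and sampling choices are assumptions,
# not taken from the MMAN code base.
from typing import List

LabelMap = List[List[int]]


def macro_view(label_map: LabelMap, factor: int) -> LabelMap:
    """Low-resolution view: keep every `factor`-th pixel in each dimension."""
    return [row[::factor] for row in label_map[::factor]]


def micro_patches(label_map: LabelMap, size: int) -> List[LabelMap]:
    """Full-resolution view: split the map into non-overlapping size x size patches."""
    height, width = len(label_map), len(label_map[0])
    patches = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            patches.append([row[left:left + size] for row in label_map[top:top + size]])
    return patches


# Toy 4x4 label map with hypothetical class ids: 0 = background, 1 = head, 2 = torso.
toy = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [2, 2, 2, 2],
    [2, 2, 0, 2],  # the lone 0 inside the torso is a local "hole"
]

coarse = macro_view(toy, 2)      # what a Macro-D-style discriminator would see
patches = micro_patches(toy, 2)  # what a Micro-D-style discriminator would see
```

In this toy case the coarse view keeps the global layout (head above torso) but no longer contains the hole, while the last patch still does. This mirrors the division of labour described above: a discriminator on the low-resolution map can judge semantic layout, whereas a patch-level discriminator on the high-resolution map can catch local defects.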