
Deep High-Resolution Representation Learning for Visual Recognition

Wang, Jingdong ; Sun, Ke ; Cheng, Tianheng ; Jiang, Borui ; Deng, Chaorui ; Zhao, Yang ; Liu, Dong ; Mu, Yadong ; Tan, Mingkui ; Wang, Xinggang ; Liu, Wenyu ; Xiao, Bin
Abstract

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) connect the high-to-low resolution convolution streams in parallel; (ii) repeatedly exchange information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that HRNet is a stronger backbone for computer vision problems. The code is available at https://github.com/HRNet.
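The two characteristics named in the abstract, parallel streams and repeated exchange across resolutions, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the resampling operators, so 2x2 average pooling and nearest-neighbour repetition are used here as stand-ins for learned layers, summation is assumed as the fusion rule, and the function names (`downsample`, `upsample`, `exchange`) are hypothetical.

```python
# Toy sketch of cross-resolution exchange between two parallel streams.
# Feature maps are plain 2-D lists with even height and width.
# All operators below are illustrative assumptions, not the HRNet layers.

def downsample(fmap):
    """Halve resolution with 2x2 average pooling (stand-in for a learned layer)."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def upsample(fmap):
    """Double resolution by repeating each value 2x2 (nearest neighbour)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add(a, b):
    """Element-wise sum of two equally sized maps (assumed fusion rule)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def exchange(high, low):
    """One exchange step: each stream receives the other, resampled to its size."""
    new_high = add(high, upsample(low))
    new_low = add(low, downsample(high))
    return new_high, new_low


# A 4x4 high-resolution stream and a 2x2 low-resolution stream kept in parallel.
high = [[float(4 * i + j) for j in range(4)] for i in range(4)]
low = downsample(high)

for _ in range(2):  # the exchange is repeated, not done once at the end
    high, low = exchange(high, low)

print(len(high), len(high[0]), len(low), len(low[0]))
```

The point of the sketch is that the high-resolution stream keeps its spatial size after every exchange, rather than being shrunk and later recovered as in a serial encoder.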
