8 months ago

Image Recognition

Convolutional Neural Network

Multi-Task Learning

Method/Architecture

Computer Vision

Ke Zhu, Jianxin Wu

Abstract

Multi-label image recognition is a challenging computer vision task of practical use. Progress in this area, however, is often characterized by complicated methods, heavy computation, and a lack of intuitive explanations. To effectively capture the different spatial regions occupied by objects from different categories, we propose an embarrassingly simple module, named class-specific residual attention (CSRA). CSRA generates class-specific features for every category by proposing a simple spatial attention score, and then combines it with the class-agnostic average pooling feature. CSRA achieves state-of-the-art results on multi-label recognition, and at the same time is much simpler than existing methods. Furthermore, with only 4 lines of code, CSRA also leads to consistent improvement across many diverse pretrained models and datasets without any extra training. CSRA is both easy to implement and light in computation, and it also enjoys intuitive explanations and visualizations.
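The abstract describes the mechanism only in words, so the following is a minimal sketch of one way to combine a class-specific spatial attention score with class-agnostic average pooling. It assumes softmax attention over spatial locations, computed from per-location class scores. The function name `csra_logits` and the parameters `lam` (weight of the attention branch) and `temperature` (sharpness of the attention) are illustrative choices with arbitrary defaults, not details stated in the abstract, and this is not the authors' reference implementation.

```python
import math


def csra_logits(features, classifiers, lam=0.2, temperature=1.0):
    """Sketch of class-specific residual attention for one image.

    features:    list of N spatial location vectors, each of length d
                 (a flattened backbone feature map).
    classifiers: list of C class weight vectors, each of length d.
    lam, temperature: assumed hyperparameters (illustrative defaults).
    Returns a list of C logits, one per class.
    """
    n = len(features)
    logits = []
    for m in classifiers:
        # Per-location class score: dot product of the location feature
        # with this class's weight vector.
        scores = [sum(x_j * m_j for x_j, m_j in zip(x, m)) for x in features]

        # Class-agnostic branch: by linearity, classifying the
        # average-pooled feature equals averaging the per-location scores.
        avg_logit = sum(scores) / n

        # Class-specific branch: softmax attention over locations
        # (shifted by the max score for numerical stability).
        peak = max(scores)
        weights = [math.exp(temperature * (s - peak)) for s in scores]
        total = sum(weights)
        att_logit = sum(w * s for w, s in zip(weights, scores)) / total

        # Residual combination of the two branches.
        logits.append(avg_logit + lam * att_logit)
    return logits
```

Under these assumptions, a temperature of 0 makes the attention uniform, so the attention branch reduces to average pooling, while a very large temperature makes it approach max pooling over locations. Because both branches reuse the same classifier weights, a sketch like this can be applied on top of an already trained classifier without further training.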

