8 months ago

Computer Vision

Convolutional Neural Network

Method/Architecture

Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang

Abstract

Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion. Most existing approaches rely heavily on sophisticated model designs and/or extra fine-grained annotations, which, to some extent, increase the difficulty in algorithm implementation and data collection. In this work, we propose an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations. It is composed of a 31-layer ResNet, an LSTM-based encoder-decoder framework and a 2-dimensional attention module. Despite its simplicity, the proposed method is robust and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks. Code is available at: https://tinyurl.com/ShowAttendRead
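To make the idea of a 2-dimensional attention module concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not spell out the scoring function, so the sketch assumes simple dot-product scoring between a decoder state and each cell of the feature map. The names `attend_2d`, `feature_map`, and `query` are hypothetical. The point it illustrates is that the softmax runs jointly over all rows and columns of the feature map rather than over a 1D sequence, which is what lets the decoder look at characters laid out along a curve or at an angle.

```python
import math


def attend_2d(feature_map, query):
    """Attention over every (row, col) cell of a 2D feature map.

    feature_map: H x W grid of D-dimensional feature vectors (nested lists),
        standing in for the CNN output.
    query: D-dimensional vector standing in for the decoder state.
        Dot-product scoring is an assumption made for this sketch.
    Returns (weights, glimpse): an H x W grid of weights that sum to 1,
    and the D-dimensional attention-weighted sum of the feature vectors.
    """
    height, width = len(feature_map), len(feature_map[0])
    # Score each spatial position against the query.
    scores = [[sum(f * q for f, q in zip(feature_map[i][j], query))
               for j in range(width)] for i in range(height)]
    # Softmax jointly over all H*W positions (subtract the max for stability).
    max_score = max(max(row) for row in scores)
    exps = [[math.exp(s - max_score) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]
    # Glimpse: attention-weighted sum of the feature vectors.
    glimpse = [sum(weights[i][j] * feature_map[i][j][k]
                   for i in range(height) for j in range(width))
               for k in range(len(query))]
    return weights, glimpse


# Toy 2 x 2 feature map with 2-dimensional features.
feature_map = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [5.0, 0.0]],
]
weights, glimpse = attend_2d(feature_map, query=[1.0, 0.0])
```

In this toy input the bottom-right cell aligns most strongly with the query, so it receives the largest weight and dominates the glimpse vector. In a full recognizer, a step like this would run once per decoded character, with the decoder state changing at each step.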

Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition