4 months ago

Object Detection

Convolutional Neural Network

Semantic Segmentation

Method/Architecture

Computer Vision

Zhou Xinyu, Yao Cong, Wen He, Wang Yuzhi, Zhou Shuchang

Abstract

Previous approaches for scene text detection have already achieved promising performances across various benchmarks. However, they usually fall short when dealing with challenging scenarios, even when equipped with deep neural network models, because the overall performance is determined by the interplay of multiple stages and components in the pipelines. In this work, we propose a simple yet powerful pipeline that yields fast and accurate text detection in natural scenes. The pipeline directly predicts words or text lines of arbitrary orientations and quadrilateral shapes in full images, eliminating unnecessary intermediate steps (e.g., candidate aggregation and word partitioning), with a single neural network. The simplicity of our pipeline allows concentrating efforts on designing loss functions and neural network architecture. Experiments on standard datasets including ICDAR 2015, COCO-Text and MSRA-TD500 demonstrate that the proposed algorithm significantly outperforms state-of-the-art methods in terms of both accuracy and efficiency. On the ICDAR 2015 dataset, the proposed algorithm achieves an F-score of 0.7820 at 13.2 fps at 720p resolution.
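To make the two headline numbers concrete, here is a minimal sketch of how such figures are commonly derived. It assumes the F-score is the usual harmonic mean of precision and recall (the abstract does not spell out the formula), and the precision and recall values in the usage lines are hypothetical placeholders, not results from the paper. The helper names `f_score` and `latency_ms` are illustrative only.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        # No correct detections at all: define the score as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


def latency_ms(fps: float) -> float:
    """Average per-frame processing time in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # 13.2 fps is the throughput reported in the abstract.
    print(f"per-frame latency: {latency_ms(13.2):.1f} ms")

    # Hypothetical precision/recall, chosen only to exercise the formula.
    p, r = 0.80, 0.70
    print(f"F-score for p={p}, r={r}: {f_score(p, r):.4f}")
```

Read this way, 13.2 fps corresponds to roughly 76 ms per 720p image, which is the budget shared by the network forward pass and any post-processing.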

