TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

Long, Shangbang ; Ruan, Jiaqiang ; Zhang, Wenjie ; He, Xin ; Wu, Wenhao ; Yao, Cong
Abstract

Driven by deep neural networks and large-scale datasets, scene text detection methods have progressed substantially over the past years, continuously refreshing the performance records on various standard benchmarks. However, limited by the representations (axis-aligned rectangles, rotated rectangles, or quadrangles) adopted to describe text, existing methods may fall short when dealing with much more free-form text instances, such as curved text, which are actually very common in real-world scenarios. To tackle this problem, we propose a more flexible representation for scene text, termed TextSnake, which is able to effectively represent text instances in horizontal, oriented, and curved forms. In TextSnake, a text instance is described as a sequence of ordered, overlapping disks centered at symmetric axes, each of which is associated with potentially variable radius and orientation. Such geometry attributes are estimated via a Fully Convolutional Network (FCN) model. In experiments, the text detector based on TextSnake achieves state-of-the-art or comparable performance on Total-Text and SCUT-CTW1500, the two newly published benchmarks with special emphasis on curved text in natural images, as well as the widely used datasets ICDAR 2015 and MSRA-TD500. Specifically, TextSnake outperforms the baseline on Total-Text by more than 40% in F-measure.
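
To make the disk-based representation concrete, the following is a minimal sketch of how a curved text instance could be stored as an ordered sequence of overlapping disks. It is an illustration under stated assumptions, not the authors' implementation: the `Disk` class, the `disks_along_arc` helper, and the choice of a quarter-circle center line are all hypothetical, and the disk attributes here are set by hand rather than predicted by an FCN.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Disk:
    """One element of the ordered sequence (hypothetical structure)."""
    cx: float      # x coordinate of the disk center on the text center line
    cy: float      # y coordinate of the disk center
    radius: float  # disk radius; may vary from disk to disk
    theta: float   # local orientation of the center line, in radians


def disks_along_arc(n: int, arc_radius: float, text_radius: float) -> List[Disk]:
    """Hypothetical helper: place n disks on a quarter circle around the origin.

    This only imitates a curved text line for illustration purposes.
    """
    if n < 2:
        raise ValueError("need at least two disks")
    disks = []
    for i in range(n):
        angle = (math.pi / 2) * i / (n - 1)
        cx = arc_radius * math.cos(angle)
        cy = arc_radius * math.sin(angle)
        # The tangent to the circle is perpendicular to the radial direction.
        theta = angle + math.pi / 2
        disks.append(Disk(cx, cy, text_radius, theta))
    return disks


def consecutive_overlap(disks: List[Disk]) -> bool:
    """Check that every pair of neighbouring disks overlaps."""
    return all(
        math.hypot(a.cx - b.cx, a.cy - b.cy) < a.radius + b.radius
        for a, b in zip(disks, disks[1:])
    )


def covers(disks: List[Disk], x: float, y: float) -> bool:
    """A point lies in the text region if at least one disk contains it."""
    return any(math.hypot(x - d.cx, y - d.cy) <= d.radius for d in disks)


if __name__ == "__main__":
    snake = disks_along_arc(n=9, arc_radius=100.0, text_radius=12.0)
    print(consecutive_overlap(snake))
    print(covers(snake, 100.0, 0.0), covers(snake, 0.0, 0.0))
```

In this sketch the text region is simply the union of the disks, so a curved instance needs no more parameters per element than a straight one. A rectangle or quadrangle, by contrast, would have to enclose the whole arc together with a large amount of background.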