Efficient Scene Text Localization and Recognition with Local Character Refinement

An unconstrained end-to-end text localization and recognition method is presented. The method detects initial text hypotheses in a single pass using an efficient region-based method and subsequently refines them using a more robust local text model, which deviates from the common assumption of region-based methods that all characters are detected as connected components. Additionally, a novel feature based on character stroke area estimation is introduced. The feature is efficiently computed from a region distance map, is invariant to scaling and rotation, and allows text regions to be detected efficiently regardless of what portion of text they capture. The method runs in real time and achieves state-of-the-art text localization and recognition results on the ICDAR 2013 Robust Reading dataset.
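
To make the idea of a stroke-area feature computed from a region distance map more concrete, the following is a minimal sketch and not the exact formulation used by the method. It assumes a binary region mask, a city-block distance map, ridge pixels defined as local maxima of that map in an 8-neighbourhood, and a stroke width of 2d - 1 at a ridge pixel with distance d. The names distance_map and stroke_area_ratio are hypothetical and chosen only for illustration.

```python
from collections import deque

# 4-connected steps used for the city-block distance (an assumption of this sketch).
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def distance_map(mask):
    """City-block distance from each region pixel to the nearest non-region pixel.

    Pixels outside the image are treated as background.
    """
    h, w = len(mask), len(mask[0])
    dist = [[0] * w for _ in range(h)]
    queue = deque()
    # Seed the search with region pixels that touch background or the image border.
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            for dr, dc in STEPS:
                nr, nc = r + dr, c + dc
                if not (0 <= nr < h and 0 <= nc < w) or not mask[nr][nc]:
                    dist[r][c] = 1
                    queue.append((r, c))
                    break
    # Breadth-first propagation inwards; each step adds one to the distance.
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] and dist[nr][nc] == 0:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def stroke_area_ratio(mask):
    """Estimated stroke area divided by region area (simplified illustration)."""
    dist = distance_map(mask)
    h, w = len(mask), len(mask[0])
    area = sum(1 for row in mask for v in row if v)
    if area == 0:
        return 0.0
    stroke_area = 0
    for r in range(h):
        for c in range(w):
            d = dist[r][c]
            if d == 0:
                continue
            # A ridge pixel is a local maximum of the distance map (8-neighbourhood).
            local_max = max(
                dist[nr][nc]
                for nr in range(max(r - 1, 0), min(r + 2, h))
                for nc in range(max(c - 1, 0), min(c + 2, w))
            )
            if d >= local_max:
                stroke_area += 2 * d - 1  # assumed stroke width at this ridge pixel
    return stroke_area / area


bar = [[1] * 10 for _ in range(3)]     # thin, stroke-like region
blob = [[1] * 10 for _ in range(10)]   # compact, non-stroke region
print(stroke_area_ratio(bar), stroke_area_ratio(blob))
```

Under these assumptions the thin 3x10 bar scores 0.8 while the 10x10 square scores 0.36, which illustrates why a ratio of this kind can separate stroke-like regions from compact ones. Because it is a ratio of two areas, it is also largely insensitive to region size.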