DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition

Denis Coquenet, Clément Chatelain, Thierry Paquet

Abstract

Unconstrained handwritten text recognition is a challenging computer vision task. It is traditionally handled by a two-step approach, combining line segmentation followed by text line recognition. For the first time, we propose an end-to-end segmentation-free architecture for the task of handwritten document recognition: the Document Attention Network. In addition to text recognition, the model is trained to label text parts using begin and end tags in an XML-like fashion. This model is made up of an FCN encoder for feature extraction and a stack of transformer decoder layers for a recurrent token-by-token prediction process. It takes whole text documents as input and sequentially outputs characters, as well as logical layout tokens. Contrary to the existing segmentation-based approaches, the model is trained without using any segmentation label. We achieve competitive results on the READ 2016 dataset at page level, as well as double-page level, with a CER of 3.43% and 3.70%, respectively. We also provide results for the RIMES 2009 dataset at page level, reaching a CER of 4.54%. We provide all source code and pre-trained model weights at https://github.com/FactoDeepLearning/DAN.
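
The results above are reported as character error rate (CER). As a rough illustration of how such a figure is commonly computed, the sketch below sums character-level edit distances over a set of pages and divides by the total number of reference characters. The function names `levenshtein` and `cer` are hypothetical and are not taken from the DAN repository. The abstract does not state the exact evaluation protocol (for example, text normalization or how layout tokens are treated), so this is an assumed, generic definition rather than the authors' implementation.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Assumed definition: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Toy strings for illustration only, not data from READ 2016 or RIMES 2009.
    refs = ["hello", "world"]
    hyps = ["hello", "word"]
    print(f"CER: {cer(refs, hyps):.2%}")
```

Normalizing by the total reference length over the whole set, rather than averaging per-page rates, keeps short pages from dominating the score; whether the paper uses this convention is an assumption here.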

