DTrOCR: Decoder-only Transformer for Optical Character Recognition

Fujitake Masato

Abstract

Typical text recognition methods rely on an encoder-decoder structure, in which the encoder extracts features from an image, and the decoder produces recognized text from these features. In this study, we propose a simpler and more effective method for text recognition, known as the Decoder-only Transformer for Optical Character Recognition (DTrOCR). This method uses a decoder-only Transformer to take advantage of a generative language model that is pre-trained on a large corpus. We examined whether a generative language model that has been successful in natural language processing can also be effective for text recognition in computer vision. Our experiments demonstrated that DTrOCR outperforms current state-of-the-art methods by a large margin in the recognition of printed, handwritten, and scene text in both English and Chinese.
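The abstract gives no implementation details, so the following is only a minimal sketch of how a decoder-only model might consume an image and emit text in a single sequence. It assumes the image is cut into non-overlapping patches that are placed ahead of the text tokens, and that a causal mask restricts each position to earlier positions. The patch size, the `[SEP]` separator, the `<patch>` placeholder, and all function names are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def split_into_patches(image: Sequence[Sequence[float]],
                       patch_h: int, patch_w: int) -> List[List[float]]:
    """Split a 2D grayscale image into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_h or width % patch_w:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            # Flatten one patch row by row into a single vector.
            patch = [image[r][c]
                     for r in range(top, top + patch_h)
                     for c in range(left, left + patch_w)]
            patches.append(patch)
    return patches


def build_sequence_layout(num_patches: int, text_tokens: Sequence[str],
                          sep_token: str = "[SEP]") -> List[str]:
    """Assumed layout: image patches first, a separator, then text tokens."""
    return ["<patch>"] * num_patches + [sep_token] + list(text_tokens)


def causal_mask(length: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[col <= row for col in range(length)] for row in range(length)]


# Toy 4x8 image split into 4x4 patches, followed by the characters of "cat".
image = [[float(r * 8 + c) for c in range(8)] for r in range(4)]
patches = split_into_patches(image, patch_h=4, patch_w=4)
layout = build_sequence_layout(len(patches), ["c", "a", "t"])
mask = causal_mask(len(layout))
```

Under this assumed layout there is no separate image encoder: text positions can attend to every patch because the patches come first, while the causal mask keeps generation left to right.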

