RolmOCR Cross-scenario Ultra-fast OCR Open Source Recognition New Benchmark

1. Tutorial Introduction
RolmOCR is an open source OCR tool developed by the Reducto AI team in April 2025, based on the Qwen2.5-VL-7B visual language model. It can extract text from images and PDFs quickly and with low memory, outperforming similar tools such as olmOCR. RolmOCR does not rely on PDF metadata, simplifies the process and supports multiple document types, such as handwritten notes and academic papers. The Reducto team aims to improve the efficiency of document digitization through model updates and training data optimization.
This tutorial uses RolmOCR as a demonstration, the image uses vllm 0.7.3-2204, and the computing resource uses RTX 4090.
2. Function List
- Fast text extraction: Extract text from images and PDFs with fast processing speed, suitable for large amounts of documents.
- Supports a variety of documents: can recognize handwritten notes, printed documents and complex tables.
- Open source and free: Released under the Apache 2.0 license, the code can be freely downloaded and adapted.
- Low memory usage: It is more resource-efficient than olmOCR and has low computer requirements when running.
- No metadata required: Work directly with the original document without relying on additional information from the PDF.
- Enhanced tilted document recognition: 15% is rotated in the training data to improve the adaptability to documents with non-positive angles.
- Based on the latest model: Using Qwen2.5-VL-7B to improve recognition accuracy and efficiency.
3. Operation steps
1. After starting the container, click the API address to enter the Web interface
If "Bad Gateway" is displayed, it means the model is initializing. Please wait for about 1-2 minutes and refresh the page.

2. Functional Demonstration


Citation Information
Thanks to GitHub user boyswu For the production of this tutorial, the project reference information is as follows:
@misc{RolmOCR,
author = {Reducto AI},
title = {RolmOCR: A Faster, Lighter Open Source OCR Model},
year = {2025},
}
Exchange and discussion
🖌️ If you see a high-quality project, please leave a message in the background to recommend it! In addition, we have also established a tutorial exchange group. Welcome friends to scan the QR code and remark [SD Tutorial] to join the group to discuss various technical issues and share application effects↓
