OCRBench-v2 Text Recognition Benchmark Dataset
OCRBench-v2 is an optical character recognition (OCR) evaluation benchmark for large multimodal models (LMMs), released in 2025 by Huazhong University of Science and Technology, South China University of Technology, ByteDance, and other institutions. It was introduced in the paper "OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning" and is designed to evaluate the OCR capabilities of LMMs across a range of text-related tasks.
The dataset is a large-scale upgrade of OCRBench. Its public test set contains 10,000 manually verified Chinese and English question-answer pairs, and an additional private test set consists of 1,500 manually annotated text-rich images drawn from a variety of sources, including print books, e-books, scanned documents, and web content. The data covers 31 typical text scenarios and 23 subtasks, grouped into eight core OCR capabilities: text recognition, text referring, text spotting, relation extraction, element parsing, mathematical calculation, visual text understanding, and knowledge reasoning.