OCRBench Text Recognition Benchmark Dataset

Date: 4 months ago
Size: 60.8 MB
Organization: Huazhong University of Science and Technology
Paper URL: arxiv.org

OCRBench is a text recognition benchmark dataset released by Huazhong University of Science and Technology and Microsoft Research. It serves as an evaluation benchmark for the optical character recognition (OCR) capabilities of large multimodal models (LMMs). The accompanying paper, "OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models", evaluates how well LMMs perform across different text-related tasks.

The dataset contains 1,000 manually screened and corrected question-answer pairs from five representative text-related tasks: text recognition, scene text-centric visual question answering (VQA), document-oriented VQA, key information extraction (KIE), and handwritten mathematical expression recognition (HMER).

The data includes:

  • Text recognition: 300 images (regular, irregular, artistic, and other text types).
  • Scene text-centric visual question answering: 200 questions.
  • Document-oriented visual question answering: 200 questions.
  • Key information extraction: 200 questions.
  • Handwritten mathematical expression recognition: 100 images from the HME100K dataset.
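
As a quick sanity check, the task counts listed above can be tallied to confirm that they add up to the stated 1,000 question-answer pairs. This is a minimal sketch: the dictionary keys and the `composition_shares` helper are illustrative names chosen here, not field names taken from the dataset files.

```python
# Task composition as stated in the dataset description.
# The keys are illustrative labels, not official field names.
TASK_COUNTS = {
    "text_recognition": 300,
    "scene_text_centric_vqa": 200,
    "document_oriented_vqa": 200,
    "key_information_extraction": 200,
    "handwritten_math_expression_recognition": 100,
}


def composition_shares(counts):
    """Return each task's share of the benchmark as a fraction of the total."""
    total = sum(counts.values())
    return {task: n / total for task, n in counts.items()}


total = sum(TASK_COUNTS.values())
shares = composition_shares(TASK_COUNTS)

print(f"total pairs: {total}")
for task, share in shares.items():
    print(f"{task}: {share:.0%}")
```

Text recognition accounts for 30% of the pairs, the three question-answering style tasks for 20% each, and handwritten mathematical expression recognition for the remaining 10%.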
OCRBench.torrent
  • OCRBench/
    • README.md (1.65 KB)
    • README.txt (3.3 KB)
    • data/
      • OCRBench.zip (60.8 MB)
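
Once the download has finished, the archive can be inspected before it is unpacked. The sketch below uses only Python's standard `zipfile` module. The local path `OCRBench/data/OCRBench.zip` is an assumption based on the file tree above, and the internal layout of the archive is not documented here, so the code only lists whatever entries it finds. The helper names `list_archive` and `extract_archive` are hypothetical.

```python
import zipfile
from pathlib import Path


def list_archive(zip_path):
    """Return (name, size_in_bytes) for every file entry in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]


def extract_archive(zip_path, target_dir):
    """Unpack the archive into target_dir and return the target path."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust to your setup.
    archive_path = Path("OCRBench/data/OCRBench.zip")
    if archive_path.exists():
        for name, size in list_archive(archive_path):
            print(f"{size:>12,d}  {name}")
```

Listing the entries first makes it easy to see how the images and annotations are organised before committing disk space to a full extraction.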
