Date

a year ago

Size

1.49 GB

Paper URL

arxiv.org

The CC-OCR dataset was jointly developed by Alibaba Group, Huazhong University of Science and Technology, and South China University of Technology in 2024 as a comprehensive and challenging benchmark for evaluating large multimodal models on text recognition (OCR) tasks. It accompanies the paper "CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy". The dataset covers four core tasks: multi-scene text reading, multi-language text reading, document parsing, and key information extraction. It contains 39 subsets and 7,058 fully annotated images. CC-OCR targets a gap in current multimodal model evaluation, namely complex structures and fine-grained visual challenges, and is intended to support the progress of multimodal models in practical applications.

CC-OCR.torrent

Seeding 2, Downloading 0, Completed 232, Total Downloads 420

CC-OCR/
- README.md
  1.52 KB
- README.txt
  3.04 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

