DocBank Text Dataset
Date: 3 years ago
Size: 48.1 GB
Publish URL
Paper URL

DocBank is a text dataset for document layout analysis. It contains 500,000 document pages with fine-grained, token-level annotations. The dataset is constructed in a simple and effective way, using weak supervision from LaTeX documents available on arXiv.
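To make "token-level annotation" concrete, here is a minimal sketch of reading such annotations in Python. It assumes each annotation line is tab-separated, with the token text first, four integer bounding-box coordinates next, and the layout label last. That field layout, the names Token, parse_line and label_counts, and the sample rows are all hypothetical and chosen for illustration; check the files in the downloaded archive for the actual format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Token:
    """One annotated token: its text, bounding box, and layout label."""
    text: str
    box: tuple[int, int, int, int]  # (x0, y0, x1, y1)
    label: str


def parse_line(line: str) -> Token:
    # Assumed layout: token, x0, y0, x1, y1, [other fields...], label
    fields = line.rstrip("\n").split("\t")
    x0, y0, x1, y1 = (int(v) for v in fields[1:5])
    return Token(text=fields[0], box=(x0, y0, x1, y1), label=fields[-1])


def label_counts(lines) -> Counter:
    # Count how many tokens carry each layout label, skipping blank lines.
    return Counter(parse_line(line).label for line in lines if line.strip())


# Hypothetical sample rows, not taken from the dataset.
sample = [
    "Deep\t100\t50\t160\t70\ttitle",
    "Learning\t165\t50\t260\t70\ttitle",
    "We\t100\t120\t120\t135\tparagraph",
]

counts = label_counts(sample)
print(counts)
```

Because every token carries its own box and label, page-level regions can be recovered by grouping adjacent tokens that share a label.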
DocBank.torrent
Seeding: 2, Downloading: 0, Completed: 419, Total Downloads: 751