DuReader Large-scale Open-domain Chinese Machine Reading Comprehension Dataset
Date
3 years ago
Size
4.11 GB
Publish URL
License
Other
Categories

DuReader is a large-scale open-domain Chinese dataset for machine reading comprehension, which can be used to train or evaluate machine reading comprehension models and systems.
The dataset consists of 200,000 questions, 420,000 answers, and 1 million documents. The questions and documents are drawn from Baidu Search and Baidu Knows, and the answers are manually generated. Each question is also manually annotated along two dimensions: its type (Entity, Description, or YesNo) and whether it asks for a Fact or an Opinion.
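As a rough illustration of how the question annotations might be used, the sketch below tallies questions by label from JSON-lines records. The field names `question_type` and `fact_or_opinion`, the label spellings, and the sample records are assumptions made for this example; check them against the files in the actual download before relying on them.

```python
import json
from collections import Counter


def count_labels(lines):
    """Count (question_type, fact_or_opinion) pairs in JSON-lines input.

    The two field names are assumed for illustration and may differ
    from the keys used in the real DuReader files.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        key = (record.get("question_type"), record.get("fact_or_opinion"))
        counts[key] += 1
    return counts


# Made-up records, not taken from the dataset.
sample = [
    '{"question": "q1", "question_type": "ENTITY", "fact_or_opinion": "FACT"}',
    '{"question": "q2", "question_type": "DESCRIPTION", "fact_or_opinion": "OPINION"}',
    '{"question": "q3", "question_type": "ENTITY", "fact_or_opinion": "FACT"}',
    '{"question": "q4", "question_type": "YES_NO", "fact_or_opinion": "OPINION"}',
]

if __name__ == "__main__":
    for (qtype, kind), n in sorted(count_labels(sample).items()):
        print(qtype, kind, n)
```

Reading the data line by line keeps memory use flat, which matters for a download of several gigabytes; to process a real file, pass an open file object to `count_labels` instead of the sample list.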
DuReader.torrent
Seeding: 1, Downloading: 1, Completed: 324, Total Downloads: 608