HyperAI

DuReader: Large-scale Open-domain Chinese Machine Reading Comprehension Dataset

Date

3 years ago

Size

4.11 GB

Organization

Baidu

Publish URL

ai.baidu.com

License

Other

DuReader is a large-scale open-domain Chinese dataset for machine reading comprehension, which can be used to train or evaluate machine reading comprehension models and systems.

The dataset consists of 200,000 questions, 420,000 answers, and 1 million documents. The questions and documents are drawn from Baidu Search and Baidu Knows (Baidu Zhidao), and the answers are manually generated. Each question is also manually annotated with a question type (Entity, Description, or YesNo) and labeled as either Fact or Opinion.
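As a rough illustration of how the question-type annotations might be used, the sketch below tallies questions by their labels. It assumes the data is stored as JSON lines with one question per line, and that the labels live in fields named `question_type` and `fact_or_opinion`; both the layout and the field names are assumptions that should be checked against the README shipped with the dataset. The sample records are hypothetical and are not taken from DuReader.

```python
import json
from collections import Counter
from typing import Iterable


def tally_question_types(lines: Iterable[str]) -> Counter:
    """Count (question_type, fact_or_opinion) pairs over JSON-lines records."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Field names are assumptions; a missing field is counted as "UNKNOWN".
        key = (
            record.get("question_type", "UNKNOWN"),
            record.get("fact_or_opinion", "UNKNOWN"),
        )
        counts[key] += 1
    return counts


# Hypothetical records for illustration only, not real DuReader data.
sample_lines = [
    json.dumps({"question": "q1", "question_type": "ENTITY", "fact_or_opinion": "FACT"}),
    json.dumps({"question": "q2", "question_type": "YES_NO", "fact_or_opinion": "OPINION"}),
    json.dumps({"question": "q3", "question_type": "ENTITY", "fact_or_opinion": "FACT"}),
]

type_counts = tally_question_types(sample_lines)
for (question_type, fact_or_opinion), n in sorted(type_counts.items()):
    print(question_type, fact_or_opinion, n)
```

To run the same tally over a real file, pass an open file handle (for example `open(path, encoding="utf-8")`, where `path` points at one of the extracted data files) in place of `sample_lines`.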

DuReader.torrent
Seeding: 1 | Downloading: 1 | Completed: 324 | Total Downloads: 608
  • DuReader/
    • README.md
      1.21 KB
    • README.txt
      2.41 KB
    • data/
      • dureader_preprocessed.zip
        2.79 GB
      • dureader_raw.zip
        4.11 GB