Date

4 years ago

Size

242.66 MB

Organization

Publish URL

ai.baidu.com

Paper URL

tcci.ccf.org.cn

License

Non-Commercial

Tags

Text Generation

DuIE is a large-scale, manually annotated dataset for evaluating schema-based knowledge extraction algorithms. It contains more than 210,000 real-world Chinese sentences and more than 450,000 SPO (Subject-Predicate-Object) triples, annotated against a pre-specified schema of 49 predicates. All sentences are extracted from Baidu Baike and Baidu News Search, and the texts cover a range of real-world domains such as news, entertainment, and user-generated content. The dataset consists of the following data:

214,590 sentences, of which:

172,983 sentences are in the training set;
21,626 sentences are in the development set;
19,981 sentences are in the test set.

457,866 instances, of which:

363,960 instances are in the training set;
45,558 instances are in the development set;
48,348 instances are in the test set.
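
The split sizes listed above can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts quoted in this description; the variable and function names are illustrative and not part of any DuIE tooling. It confirms that the per-split counts add up to the stated totals and prints each split's share along with the average number of instances per sentence.

```python
# Split sizes copied from the dataset description above.
SENTENCES = {"train": 172_983, "dev": 21_626, "test": 19_981}
INSTANCES = {"train": 363_960, "dev": 45_558, "test": 48_348}


def summarize(counts):
    """Return the total count and each split's share of that total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


sentence_total, sentence_shares = summarize(SENTENCES)
instance_total, instance_shares = summarize(INSTANCES)

# The per-split counts should add up to the totals stated in the description.
assert sentence_total == 214_590
assert instance_total == 457_866

for name in SENTENCES:
    per_sentence = INSTANCES[name] / SENTENCES[name]
    print(
        f"{name:5s} sentences={sentence_shares[name]:.1%} "
        f"instances={instance_shares[name]:.1%} "
        f"instances/sentence={per_sentence:.2f}"
    )
```

Both totals match the figures given in the description, so the listed splits are internally consistent.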

DuIE.torrent

Seeding: 2, Downloading: 0, Completed: 654, Total Downloads: 1,451

DuIE/
- README.md (1.53 KB)
- README.txt (3.07 KB)

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

DuIE Large-Scale Chinese Information Extraction Dataset
