Date

6 hours ago

Paper URL

2602.11685

License

MIT

Tags

Finance

Medicine

Artificial Intelligence

The DRACO cross-domain deep research benchmark is a dataset released by the Perplexity team for evaluating complex research tasks. The accompanying paper, "DRACO: A Cross-Domain Benchmark for Deep Research Accuracy, Completeness, and Objectivity", aims to systematically evaluate deep research systems on accuracy, completeness, and objectivity. The dataset contains 100 complex research tasks that cover 40 countries and regions across five continents and span 10 major application areas, including finance, shopping/product comparison, academia, and technology. Each task is a multi-step, multi-source information retrieval and analysis problem, accompanied by evaluation criteria designed and validated by 26 domain experts. Each task's criteria comprise about 40 evaluation items on average and assess the model output at a fine-grained level along four dimensions: factual accuracy, breadth and depth of analysis, presentation quality, and citation quality. The original page shows the task distribution by field in a figure.

Data Fields:

id: The unique identifier for the task.
domain: The domain to which the task belongs.
problem: The complete research query to be answered.
answer: The evaluation criteria, encoded in JSON, with specific standards for each evaluation dimension.
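
The following is a minimal sketch of how a record with these four fields might be read. The sample records, the dimension keys (such as factual_accuracy), and the layout inside the answer field (a JSON object mapping each dimension to a list of criteria) are all assumptions made for illustration; the page only states that answer is JSON-encoded, so the real schema may differ.

```python
import json
from collections import Counter

# Hypothetical records mirroring the four documented fields.
# The structure inside `answer` is an assumption, not the dataset's real schema.
records = [
    {
        "id": "task-001",
        "domain": "Finance",
        "problem": "Placeholder research query.",
        "answer": json.dumps({
            "factual_accuracy": [{"criterion": "Placeholder criterion A"}],
            "citation_quality": [
                {"criterion": "Placeholder criterion B"},
                {"criterion": "Placeholder criterion C"},
            ],
        }),
    },
    {
        "id": "task-002",
        "domain": "Medicine",
        "problem": "Another placeholder research query.",
        "answer": json.dumps({
            "presentation_quality": [{"criterion": "Placeholder criterion D"}],
        }),
    },
]


def parse_rubric(record):
    """Decode the JSON string stored in the `answer` field."""
    return json.loads(record["answer"])


def criteria_per_dimension(rubric):
    """Count criteria under each dimension (assumes a dict of lists)."""
    return {dimension: len(items) for dimension, items in rubric.items()}


def tasks_per_domain(all_records):
    """Tally how many tasks fall into each domain."""
    return Counter(record["domain"] for record in all_records)


domain_counts = tasks_per_domain(records)
rubric_counts = [criteria_per_dimension(parse_rubric(r)) for r in records]
```

Here domain_counts gives a per-domain tally of tasks, and rubric_counts lists, for each task, how many criteria sit under each dimension. If the actual answer payload is nested differently, only criteria_per_dimension would need to change.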

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

Related Datasets

MCIF Multimodal Cross-Language Instruction Following Dataset

3 months ago

CL-bench Context Learning Evaluation Benchmark Dataset

a month ago

Pan-Cancer scRNA-Seq Cancer Single-Cell Transcriptional Atlas Dataset

a month ago

CHIMERA General Inference Synthetic Dataset

a month ago

Open-RL Inference Problem Dataset

a month ago

Groundsource Global Flood Events Dataset

23 days ago
