Date: a month ago

License: Apache 2.0

Tags: Intelligent Question Answering, Natural Language Processing

DeepSearchQA is an information retrieval and factual evaluation dataset for large language models and intelligent agents, released by Google DeepMind in 2025. The related research paper is "DeepSearchQA: Bridging the Comprehensiveness Gap for Deep Research Agents". The dataset aims to evaluate a model's planning ability, its ability to retain context, and how thoroughly it uses open-web information in complex, multi-step information search tasks.

This dataset contains 900 manually designed evaluation samples covering 17 different domains. Each sample consists of a question prompt, the question's domain category, a reference answer used for evaluation, and an answer type label. The answer type is either a single answer or a set answer, and approximately 65% of the questions require the model to provide a complete set of answers. All questions follow a "causal chain" design, in which each retrieval step depends on the search results of the previous steps, so the model must execute a multi-step search plan and keep its context consistent over a long horizon. All tasks are based on the open web, so that answers are objective and verifiable. The dataset is primarily used to evaluate large language models or intelligent agent systems with web search capabilities.
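The four-part sample structure and the single/set distinction described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names (`problem`, `category`, `answer`, `answer_type`), the labels `"single"` and `"set"`, the comma-separated encoding of set answers, and the precision/recall/F1 scoring are all hypothetical and are not taken from the dataset or its paper. The sample record is made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # Hypothetical field names; the published schema may differ.
    problem: str      # question prompt
    category: str     # domain category
    answer: str       # reference answer used for evaluation
    answer_type: str  # assumed labels: "single" or "set"


def normalize(item: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(item.lower().split())


def parse_items(answer: str, answer_type: str) -> set[str]:
    """Split a reference answer into items.

    Assumption: set answers are comma-separated; single answers are one item.
    """
    parts = answer.split(",") if answer_type == "set" else [answer]
    return {normalize(p) for p in parts if p.strip()}


def set_scores(predicted: set[str], gold: set[str]) -> tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted items against gold items.

    This is a generic set-overlap score, not the benchmark's official metric.
    """
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Made-up record, for illustration only.
sample = Sample(
    problem="Made-up question that expects three items",
    category="Example domain",
    answer="Alpha, Beta, Gamma",
    answer_type="set",
)
gold = parse_items(sample.answer, sample.answer_type)
predicted = {normalize(x) for x in ["alpha", "Gamma", "Delta"]}
precision, recall, f1 = set_scores(predicted, gold)
```

In this sketch a set answer earns full recall only when every reference item is returned, which mirrors the requirement that the model provide a complete set of answers rather than a single correct item.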

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.
