DeepSearchQA Multi-Step Information Search Question Answering Dataset
License: Apache 2.0
DeepSearchQA is an information retrieval and factual evaluation dataset for large language models and intelligent agents, released by Google DeepMind in 2025. The related research paper is "DeepSearchQA: Bridging the Comprehensiveness Gap for Deep Research Agents". The dataset aims to evaluate a model's planning ability, its ability to retain context, and its comprehensive use of open-web information in complex, multi-step information search tasks.
This dataset contains 900 manually designed evaluation samples covering 17 different domains. Each sample consists of a question prompt, the corresponding question domain category, a reference answer for evaluation, and an answer type label. The answer type is either a single answer or a set answer, and approximately 65% of the questions require the model to provide a complete set of answers. All questions are designed in a "causal chain" format: each retrieval step depends on the search results of the previous steps, so the model must execute a multi-step search plan and keep its context consistent over a long interaction. All tasks are based on the open web, so that answers are objective and verifiable. The dataset is primarily used to evaluate large language models or intelligent agent systems with web search capabilities.
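The sample structure described above can be sketched in Python as follows. This is a minimal illustration, not the dataset's actual schema or the paper's official metric: the field names (prompt, category, answer, answer_type), the comma-separated encoding of set answers, and the exact-match precision/recall scoring are all assumptions made for the example, and the sample shown is placeholder data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One evaluation sample. Field names are hypothetical."""
    prompt: str        # the question posed to the model
    category: str      # one of the domain categories
    answer: str        # reference answer used for evaluation
    answer_type: str   # assumed to be "single" or "set"


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as misses."""
    return " ".join(text.lower().split())


def gold_items(sample: Sample) -> set[str]:
    """Return the reference answer as a set of normalized items.

    Assumption: set answers are stored as a comma-separated string.
    """
    if sample.answer_type == "single":
        return {normalize(sample.answer)}
    return {normalize(part) for part in sample.answer.split(",") if part.strip()}


def score(sample: Sample, predicted: list[str]) -> tuple[float, float]:
    """Return (precision, recall) of the predicted items against the reference set."""
    gold = gold_items(sample)
    pred = {normalize(p) for p in predicted if p.strip()}
    hits = len(gold & pred)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Placeholder sample, not taken from the dataset.
example = Sample(
    prompt="Which items satisfy the condition?",
    category="Example domain",
    answer="Alpha, Beta, Gamma",
    answer_type="set",
)
precision, recall = score(example, ["alpha", "gamma", "delta"])
```

Reporting recall alongside precision reflects the point made above about set answers: a response that lists only some of the required items is correct as far as it goes but incomplete, and recall is what exposes the gap.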