ZeroSearch Question Answering Dataset
ZeroSearch is a high-quality question-answering dataset released by Alibaba Tongyi Lab in 2025. It focuses on building a model's ability to answer questions directly, without external search. The associated paper is "ZeroSearch: Incentivize the Search Capability of LLMs without Searching".
The dataset contains about 170,000 samples spanning multiple knowledge areas, including science, history, film and television, and geography and the humanities. Question types include factual, definitional, and true/false questions, which makes the dataset suitable for training small and medium-sized question-answering models. Its carefully designed question-answer pairs are intended to evaluate a model's commonsense reasoning, factual recall, and logical inference, and to provide standardized training and testing resources for natural language processing.