HyperAI

ZeroSearch Question Answering Dataset

Date

18 days ago

Publish URL

huggingface.co

ZeroSearch is a high-quality question-answering dataset released by Alibaba Tongyi Lab in 2025. It focuses on building a model's ability to answer questions directly, without relying on external search. The accompanying paper is "ZeroSearch: Incentivize the Search Capability of LLMs without Searching".

The dataset contains about 170,000 samples spanning multiple knowledge areas, including scientific knowledge, historical events, film and television entertainment, and geography and the humanities, among others. Question types include factual questions, definition questions, and true/false questions, which makes the dataset suitable for training small and medium-sized question-answering models. Its carefully designed question-answer pairs are intended to evaluate a model's common-sense reasoning, factual recall, and logical inference, and to provide standardized training and testing resources for natural language processing.
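As a rough illustration of how question-answer pairs like these are commonly scored, the sketch below computes normalized exact-match accuracy. It is not taken from the ZeroSearch paper or the dataset card: the field names `question` and `answers`, the sample records, and the normalization rules are all assumptions chosen for the example, and the dataset's actual schema may differ.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize(prediction)
    return any(pred == normalize(gold) for gold in gold_answers)


def accuracy(samples: list[dict], predictions: list[str]) -> float:
    """Fraction of samples whose prediction exactly matches a gold answer."""
    if not samples:
        return 0.0
    hits = sum(exact_match(p, s["answers"]) for s, p in zip(samples, predictions))
    return hits / len(samples)


# Hypothetical records; the field names are assumed, not the dataset's real schema.
samples = [
    {"question": "What is the capital of France?", "answers": ["Paris"]},
    {"question": "Is the Pacific the largest ocean?", "answers": ["yes", "true"]},
    {"question": "Who wrote Hamlet?", "answers": ["William Shakespeare", "Shakespeare"]},
]
predictions = ["The Paris.", "Yes", "Christopher Marlowe"]

print(f"exact-match accuracy: {accuracy(samples, predictions):.2f}")
```

Exact match is strict by design: it suits short factual and true/false answers, but longer definition-style answers would usually need a softer metric such as token-level F1.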