WebExplorer-QA Information Retrieval Question Answering Dataset

Date

2 months ago

Organization

Hong Kong University of Science and Technology
MiniMax
University of Waterloo

Paper URL

2509.06501

License

Apache 2.0

WebExplorer-QA is a dataset for information retrieval and web browsing tasks, released in 2025 by the Hong Kong University of Science and Technology, MiniMax, and the University of Waterloo. The accompanying paper, "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents", aims to improve model performance on complex multi-step reasoning and long-horizon web navigation by systematically generating challenging query-answer pairs.

Currently, only 100 high-quality examples from this dataset are publicly available for academic research and community testing. Initial question-answer pairs are produced through model-based exploration and then iteratively refined by a "long-to-short" query evolution mechanism, which increases question difficulty and strengthens the link between information retrieval and query accuracy. Answering these questions requires a model to perform multi-step retrieval and browsing operations and to aggregate information from multiple web pages. The pairs are suitable for training and evaluating web agents or large language models on information seeking, multi-step reasoning, long-horizon context processing, tool calling, and web navigation.
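
As a rough illustration of how such question-answer pairs might be used for evaluation, the sketch below scores an agent by normalized exact match. It is a minimal example under stated assumptions, not the evaluation protocol of the paper: the field names `question` and `answer`, the `agent` callable, and the choice of exact match as the metric are all hypothetical, and the toy records are invented placeholders rather than dataset entries.

```python
import re
import string
from typing import Callable, Dict, Iterable


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_accuracy(
    examples: Iterable[Dict[str, str]],
    agent: Callable[[str], str],
) -> float:
    """Fraction of examples whose normalized prediction equals the normalized answer.

    Assumes each example is a dict with hypothetical keys "question" and "answer".
    """
    examples = list(examples)
    if not examples:
        return 0.0
    correct = sum(
        normalize(agent(ex["question"])) == normalize(ex["answer"])
        for ex in examples
    )
    return correct / len(examples)


if __name__ == "__main__":
    # Toy placeholder records; NOT taken from WebExplorer-QA.
    toy_examples = [
        {"question": "q1", "answer": "Paris"},
        {"question": "q2", "answer": "42"},
    ]
    # Stand-in for a real web agent that would search and browse before answering.
    canned = {"q1": "paris.", "q2": "41"}
    print(exact_match_accuracy(toy_examples, lambda q: canned[q]))
```

Exact match is only a simple stand-in here. For long-horizon web tasks with free-form answers, a more lenient comparison may be a better fit.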
