DeepPlanning Long-Term Planning Capability Assessment Dataset

Date: 4 hours ago
Organization: Alibaba Group
Paper URL: 2601.18137
License: Apache 2.0

DeepPlanning is a dataset for evaluating the planning capabilities of intelligent agents, released by the Qwen team in 2026. The related paper is "DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints". The dataset aims to evaluate the reasoning and decision-making abilities of intelligent agents in complex, long-term planning tasks.

This dataset includes two types of tasks: multi-day travel planning and multi-item shopping planning. The travel planning task contains 120 independent task examples, available in both Chinese and English. Each task corresponds to an independent environment and includes structured background data covering information such as transportation, accommodation, attractions, schedules, and costs, averaging approximately 7,700 relevant records. The shopping planning task includes 120 independent task examples in English, each equipped with a product database covering information such as product prices, inventory, discount rules, and budget constraints, averaging approximately 170 records.
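
To make the idea of a verifiable constraint concrete, the following is a minimal sketch of how a shopping plan might be checked against a budget and inventory. The `Product` fields, the `check_plan` function, and the sample catalog are all hypothetical and are not taken from the dataset's actual schema or evaluation code; discount rules are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical record layout; the real dataset schema may differ.
    name: str
    price: float
    stock: int


def check_plan(catalog, plan, budget):
    """Return a list of violated constraints (empty if the plan is valid).

    catalog: dict mapping product name -> Product
    plan:    dict mapping product name -> requested quantity
    budget:  maximum total spend allowed
    """
    violations = []
    total = 0.0
    for name, qty in plan.items():
        product = catalog.get(name)
        if product is None:
            violations.append(f"unknown product: {name}")
            continue
        if qty > product.stock:
            violations.append(f"insufficient stock: {name}")
        total += product.price * qty
    if total > budget:
        violations.append("over budget")
    return violations


# Illustrative data only, not drawn from the dataset.
catalog = {
    "kettle": Product("kettle", 30.0, 2),
    "mug": Product("mug", 8.0, 5),
}

valid = check_plan(catalog, {"kettle": 1, "mug": 2}, budget=50.0)
invalid = check_plan(catalog, {"mug": 6}, budget=40.0)
print(valid)
print(invalid)
```

Because each check is a simple rule over structured records, a plan can be scored automatically without human judgment, which is the property the term "verifiable constraints" suggests.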
