AgentNet Desktop Operation Task Dataset
License: MIT
AgentNet is the first large-scale dataset of desktop computer-use agent trajectories. It was released in 2025 by the XLANG Lab at the University of Hong Kong in collaboration with Moonshot AI, Stanford University, and other institutions. The accompanying paper is "OpenCUA: Open Foundations for Computer-Use Agents". The dataset is intended to support and evaluate cross-platform GUI agents and vision-language-action (VLA) models.
The dataset contains 22.6K manually annotated computer-use task trajectories that cover Windows, macOS, and Ubuntu and span more than 200 applications and websites. The tasks fall into four categories: office, professional, daily, and system. The dataset is suited to training and evaluating agents for desktop automation, multi-application workflows, and cross-platform operation.
Data structures and fields
Each sample contains:
- Task metadata: task ID (task_id) and task instruction (instruction);
- Quality rating: completion, consistency, efficiency, and difficulty;
- Summary description: natural_language_task, actual_task;
- Trajectory array: traj (operation steps recorded in chronological order).
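As a rough illustration of the field list above, the following Python sketch checks that a sample carries the top-level keys named there. It assumes a sample is available as a plain dictionary. The helper `missing_keys` and the example sample are hypothetical. The quality rating keys are left out because their exact names are not stated here.

```python
from typing import Any

# Top-level keys named in the field list above. The quality rating keys are
# omitted because their exact names are not given in this description.
REQUIRED_KEYS = (
    "task_id",
    "instruction",
    "natural_language_task",
    "actual_task",
    "traj",
)


def missing_keys(sample: dict[str, Any]) -> list[str]:
    """Return the required top-level keys that are absent from a sample."""
    return [key for key in REQUIRED_KEYS if key not in sample]


# Hypothetical sample, invented for illustration only (not real dataset content).
example_sample = {
    "task_id": "demo-0001",
    "instruction": "Rename the file report.txt to summary.txt",
    "natural_language_task": "Rename a text file on the desktop",
    "actual_task": "Renamed report.txt to summary.txt",
    "traj": [],
}

print(missing_keys(example_sample))  # → []
```

A check like this is only a sanity pass before training or evaluation; it says nothing about the value types, which should be confirmed against the released files.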
Trajectory step (traj) structure:
- Each step contains an index, an image (screenshot), and a value object.
- The value object contains: observation (scene observation), thought (thinking/planning), action (natural-language action), code (executable code, such as PyAutoGUI), last_step_correct, last_step_redundant, and reflection.
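The step layout described above could be consumed roughly as follows. This is a minimal sketch, assuming each sample is a dictionary whose `traj` entries hold `index`, `image`, and a nested `value` dictionary. The function `flatten_steps` and the demo sample are hypothetical, and the real files may store images or values in a different form.

```python
from typing import Any


def flatten_steps(sample: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one sample into per-step records, ordered by step index."""
    records = []
    for step in sorted(sample["traj"], key=lambda s: s["index"]):
        value = step["value"]
        records.append(
            {
                "task_id": sample["task_id"],
                "index": step["index"],
                "image": step["image"],
                # .get() tolerates steps where a field is absent.
                "thought": value.get("thought"),
                "action": value.get("action"),
                "code": value.get("code"),
            }
        )
    return records


# Hypothetical two-step sample, invented for illustration only.
demo_sample = {
    "task_id": "demo-0001",
    "instruction": "Open the text editor and type hello",
    "traj": [
        {
            "index": 1,
            "image": "step_1.png",
            "value": {
                "thought": "The editor is open, so type the text.",
                "action": "Type 'hello'",
                "code": "pyautogui.write('hello')",
            },
        },
        {
            "index": 0,
            "image": "step_0.png",
            "value": {
                "thought": "The editor icon is on the desktop.",
                "action": "Click the editor icon",
                "code": "pyautogui.click(x=100, y=200)",
            },
        },
    ],
}

for record in flatten_steps(demo_sample):
    print(record["index"], record["action"], "->", record["code"])
```

Sorting by `index` keeps the records in chronological order even if the steps arrive out of order, which matters when each step is paired with its screenshot for training.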
