GroundCUA Interface Operation Training Dataset
License: MIT
GroundCUA is a real-world user interface (UI) dataset released in 2025 by the Mila Quebec Artificial Intelligence Institute in collaboration with McGill University, the University of Montreal, and other institutions. The related research paper is titled "Grounding Computer Use Agents on Human Demonstrations." The goal is to support research on multimodal intelligent agents that can interact with computers.
This dataset contains approximately 56,000 desktop screenshots covering 87 applications in 12 categories. Built on expert-level human demonstrations, it includes over 3.56 million manually verified element-level annotations. It spans Windows, macOS, Linux, and various cross-platform software, covering common applications such as productivity tools, communication software, creative tools, system tools, and development environments. The data is organized by software platform, which makes it easier to build scalable data processing pipelines.
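As an illustration of how a per-platform layout can feed a processing pipeline, the sketch below walks a local copy of the dataset and pairs each screenshot with its annotation file. The directory layout is an assumption made for this example (one top-level folder per platform, with each PNG stored next to a JSON file of the same name), and `iter_pairs` is a hypothetical helper, not part of any official GroundCUA tooling. Check the actual release for the real folder structure.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_pairs(root: Path) -> Iterator[Tuple[str, Path, Path]]:
    """Yield (platform, screenshot_path, annotation_path) tuples.

    Assumed layout (hypothetical): root/<platform>/.../<name>.png
    with a matching <name>.json in the same folder.
    """
    platform_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    for platform_dir in platform_dirs:
        for png in sorted(platform_dir.rglob("*.png")):
            annotation = png.with_suffix(".json")
            # Skip screenshots that have no annotation file beside them.
            if annotation.exists():
                yield platform_dir.name, png, annotation
```

Because the function is a generator, downstream steps can consume pairs lazily instead of loading the whole file list into memory.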
Data composition:
- UI screenshot images (PNG)
- Element-level annotation files (JSON), each recording:
  - Element position and size (bounding box)
  - On-screen text content
  - UI function category tag
  - Unique element ID
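The annotation fields listed above could be read into typed records along the following lines. This is a minimal sketch, not the dataset's documented schema: the key names `id`, `bbox`, `text`, and `category`, the `(x, y, width, height)` pixel convention for the bounding box, and the inline sample record are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UIElement:
    element_id: str
    bbox: Tuple[float, float, float, float]  # assumed (x, y, width, height) in pixels
    text: str
    category: str


def parse_elements(raw: str) -> List[UIElement]:
    """Parse a JSON array of element records (hypothetical key names)."""
    records = json.loads(raw)
    return [
        UIElement(
            element_id=str(r["id"]),
            bbox=tuple(r["bbox"]),
            text=r.get("text", ""),
            category=r.get("category", ""),
        )
        for r in records
    ]


def center(el: UIElement) -> Tuple[float, float]:
    """Return the center point of the element's bounding box."""
    x, y, w, h = el.bbox
    return (x + w / 2, y + h / 2)


# Made-up sample record, used only to exercise the parser.
sample = '[{"id": "e1", "bbox": [10, 20, 100, 40], "text": "Save", "category": "Button"}]'
elements = parse_elements(sample)
```

A center point like the one `center` computes is a common click target when training grounding models, which is why the helper is included here.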
