WGO-Bench Robot Video Benchmark Dataset
License: Non-Commercial
WGO-Bench is a robot video benchmark dataset released by Macrodata Labs. It aims to evaluate how well vision-language models convert robot and first-person action videos into timestamped subtask annotations. The benchmark focuses on two tasks: boundary detection and subtask annotation. The annotation labels describe the complete action events and state changes visible in each video clip.
Dataset composition:
- It contains 100 video episodes, covering 743 key subtasks and 63 unique task instructions.
- The data sources are divided into three categories: HomER first-person videos (25 videos), RoboInter DROID robotic arm videos (50 videos), and RoboCOIN Galaxea R1 Lite head-mounted camera videos (25 videos).
- The data is stored in Parquet format, with each video's MP4 bytes embedded directly in its row.
Data fields:
- id: A stable, unique identifier for a video clip.
- video: The MP4 video, embedded directly as binary data.
- instruction: The high-level task instruction for this video clip.
- segments: A list of gold-labeled segments, each element containing start_sec (start time), end_sec (end time), and subtask (subtask description).
- metadata: Additional source-specific information in JSON format.
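The fields above can be sketched as typed Python records. This is a minimal illustration, not an official loader: it assumes each row has already been read into a plain dict (for example from a DataFrame returned by pandas.read_parquet), that video holds raw bytes, and that metadata arrives either as a JSON string or as an already-decoded dict. The names Segment, Episode, parse_episode and check_segments, and all values in the sample row, are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Segment:
    start_sec: float
    end_sec: float
    subtask: str


@dataclass
class Episode:
    id: str
    video: bytes
    instruction: str
    segments: list[Segment]
    metadata: dict[str, Any]


def parse_episode(row: dict[str, Any]) -> Episode:
    """Convert one raw row (a plain dict) into a typed Episode."""
    meta = row["metadata"]
    # Assumption: metadata may be a JSON string or an already-decoded dict.
    if isinstance(meta, (str, bytes)):
        meta = json.loads(meta)
    segments = [
        Segment(float(s["start_sec"]), float(s["end_sec"]), str(s["subtask"]))
        for s in row["segments"]
    ]
    return Episode(
        id=str(row["id"]),
        video=bytes(row["video"]),  # assumption: raw MP4 bytes
        instruction=str(row["instruction"]),
        segments=segments,
        metadata=meta,
    )


def check_segments(segments: list[Segment]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for i, seg in enumerate(segments):
        if seg.start_sec < 0 or seg.end_sec <= seg.start_sec:
            problems.append(f"segment {i}: invalid time range")
        if not seg.subtask.strip():
            problems.append(f"segment {i}: empty subtask")
    return problems


# Hypothetical sample row; every value here is made up for illustration.
row = {
    "id": "example-0001",
    "video": b"\x00\x00\x00\x18ftyp",  # placeholder bytes, not a playable MP4
    "instruction": "put the cup on the shelf",
    "segments": [
        {"start_sec": 0.0, "end_sec": 2.5, "subtask": "grasp the cup"},
        {"start_sec": 2.5, "end_sec": 5.0, "subtask": "place the cup on the shelf"},
    ],
    "metadata": '{"source": "example"}',
}

episode = parse_episode(row)
print(episode.instruction, len(episode.segments), check_segments(episode.segments))
```

To view a clip, the bytes in episode.video can be written to a file with an .mp4 extension and opened in any video player.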