M3-Bench Long Video Question Answering Benchmark Dataset

Date: 3 months ago
Organization: ByteDance Seed
Paper URL: 2508.09736
License: Non-Commercial

M3-Bench is a long-video question answering benchmark dataset released by the ByteDance Seed team in 2025. It accompanies the paper "Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory" and is designed to evaluate the long-term memory and reasoning abilities of multimodal agents.

The dataset contains 1,020 video samples, each of which includes captions, intermediate outputs, and memory maps. The core task of M3-Bench is open-ended question answering over long videos: each video is accompanied by a set of open-ended questions.

Data composition:

  • M3-Bench-robot: 100 newly recorded first-person videos of real-world scenarios, captured by the research team from a robot's perspective
  • M3-Bench-web: 920 long videos from the internet, covering a wider range of content and scenarios
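
The composition above can be sketched as a small Python index, which may help when sanity-checking a local copy of the benchmark. This is a minimal sketch under stated assumptions: the `VideoSample` class, its field names, and the placeholder IDs are hypothetical and do not reflect the dataset's actual file layout. Only the subset names and sizes (100 and 920) come from the description above.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

# Subset sizes as stated in the dataset description.
SUBSET_SIZES: Dict[str, int] = {"robot": 100, "web": 920}


@dataclass
class VideoSample:
    """Hypothetical record layout; field names are illustrative only."""
    video_id: str
    subset: str  # "robot" or "web"
    questions: List[str] = field(default_factory=list)  # open-ended questions


def build_placeholder_index() -> List[VideoSample]:
    """Create one placeholder record per video, using made-up IDs."""
    return [
        VideoSample(video_id=f"{subset}_{i:04d}", subset=subset)
        for subset, size in SUBSET_SIZES.items()
        for i in range(size)
    ]


def count_by_subset(samples: List[VideoSample]) -> Dict[str, int]:
    """Count how many samples belong to each subset."""
    return dict(Counter(sample.subset for sample in samples))


index = build_placeholder_index()
counts = count_by_subset(index)
print(len(index), counts)
```

With real data, the placeholder builder would be replaced by a loader for the downloaded files, and `count_by_subset` could confirm that the two subsets add up to the 1,020 samples stated above.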
