MMMC Educational Video Generation Benchmark Dataset

Date: 19 days ago
Organization: National University of Singapore
Paper URL: 2510.01174
License: MIT

MMMC is a large-scale, multidisciplinary benchmark dataset for educational video generation, released in 2025 by Show Lab at the National University of Singapore. It accompanies the paper "Code2Video: A Code-centric Paradigm for Educational Video Generation" and aims to provide high-quality training and evaluation resources for educational AI models, and to support research on automatically generating professional teaching videos from structured code and teaching content.

The dataset contains 117 complete instructional videos covering 13 subject areas, including calculus, geometry, probability theory, and neural networks. A full video averages 1,014 seconds (about 16.9 minutes), and a segmented clip averages 201 seconds (about 3.35 minutes). The videos are sourced from the 3Blue1Brown (3B1B) YouTube channel, which is known for effective teaching and polished animation. MMMC was constructed under two criteria: educational relevance, meaning each topic has pedagogical value; and actionable support, meaning each concept corresponds to a high-quality Manim reference, so that it can be visualized and reproduced.

Dataset structure

  • Data files
    • metadata.jsonl: The main metadata file containing structured information for each video instance.
  • Each entry in metadata.jsonl contains:
    • id: Unique identifier of the video slice.
    • Category: High-level subject category (e.g., mathematics, physics, computer science).
    • Video: The file path of the corresponding educational video clip.
    • main_topics: List of teaching topics.
    • num_slices: The number of video slices the lecture is divided into.
    • reference_image: A key reference image related to the subject (optional).
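
The entry layout above can be read with a few lines of Python. This is a minimal sketch, not the dataset's official loader: it assumes metadata.jsonl holds one JSON object per line, that the file sits in the current directory, and that the field names match the list above. Because the exact capitalization of keys such as Category and Video is not confirmed here, the sketch lowercases all keys on load. The helper names load_metadata and count_by_category are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_metadata(path):
    """Read a JSON Lines file and return a list of dicts with lowercased keys."""
    entries = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            # Assumption: key capitalization may vary, so normalize to lowercase.
            entries.append({key.lower(): value for key, value in record.items()})
    return entries


def count_by_category(entries):
    """Count entries per subject category; entries without one go under 'unknown'."""
    return Counter(entry.get("category", "unknown") for entry in entries)


if __name__ == "__main__":
    metadata_path = Path("metadata.jsonl")  # assumed location of the file
    if metadata_path.exists():
        entries = load_metadata(metadata_path)
        for category, count in count_by_category(entries).most_common():
            print(f"{category}: {count}")
```

Normalizing the keys keeps the rest of the code independent of how the published file capitalizes its fields; drop that step if the actual keys are known.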
