HyperAI

MiraData: A Large-scale Video Dataset With Long Duration and Structured Captions

Date

9 months ago

Size

315.23 MB

Organization

The Chinese University of Hong Kong

Publish URL

github.com


MiraData is a large-scale video dataset jointly developed by Tencent PCG ARC Lab and the Chinese University of Hong Kong in 2024. It is designed for long video generation tasks. The accompanying paper is "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions".

Unlike earlier datasets built from short video clips, MiraData focuses on uncut video clips of 1 to 2 minutes (average duration 72.1 seconds). Each video comes with structured descriptions written from several perspectives, with an average description length of 318 words, so that the video content is covered comprehensively. There are six types of description: subject description, background, style, camera movement, short description, and dense description.
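The six description types can be pictured as one record per video. The following is a minimal sketch only: the class name, field names, and helper function are hypothetical and chosen to mirror the list above; they are not taken from the dataset's actual file format.

```python
from dataclasses import dataclass, fields


@dataclass
class StructuredCaption:
    # Hypothetical field names, one per description type listed above.
    subject: str
    background: str
    style: str
    camera_movement: str
    short_caption: str
    dense_caption: str


def total_words(caption: StructuredCaption) -> int:
    """Count whitespace-separated words across all six descriptions."""
    return sum(len(getattr(caption, f.name).split()) for f in fields(caption))


# Illustrative values only, not real dataset content.
example = StructuredCaption(
    subject="a red car",
    background="city street",
    style="realistic",
    camera_movement="pan left",
    short_caption="car drives",
    dense_caption="a red car drives down a city street",
)
print(total_words(example))
```

A helper like `total_words` is one way to check a description-length statistic such as the 318-word average quoted above, assuming the captions are loaded into records of this shape.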

To ensure high-quality clips, the research team filtered the dataset into five subsets based on aesthetics, motion intensity, and color, selecting clips with high visual quality and strong motion intensity. To obtain detailed and accurate descriptions, the team first generated short captions with a state-of-the-art caption generator and then enriched them with GPT-4V to produce dense captions, in order to provide fine-grained video descriptions from multiple perspectives.
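The filtering step can be sketched as threshold-based selection over per-clip scores. This is an illustration under stated assumptions, not the team's actual pipeline: the score names, the [0, 1] score range, the threshold values, and the rule that a clip must pass all three criteria are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClipScores:
    clip_id: str
    aesthetic: float  # hypothetical score in [0, 1]
    motion: float     # hypothetical motion-intensity score in [0, 1]
    color: float      # hypothetical color-quality score in [0, 1]


# Hypothetical thresholds, one per subset, from loosest to strictest.
THRESHOLDS = [0.0, 0.2, 0.4, 0.6, 0.8]


def filter_into_subsets(clips, thresholds=THRESHOLDS):
    """Return one list of clip ids per threshold.

    A clip enters a subset only if its weakest score meets the threshold
    (an assumed rule, used here purely for illustration).
    """
    subsets = []
    for t in thresholds:
        passing = [
            c.clip_id
            for c in clips
            if min(c.aesthetic, c.motion, c.color) >= t
        ]
        subsets.append(passing)
    return subsets


clips = [
    ClipScores("a", 0.9, 0.85, 0.95),
    ClipScores("b", 0.5, 0.7, 0.3),
    ClipScores("c", 0.1, 0.9, 0.9),
]
for threshold, ids in zip(THRESHOLDS, filter_into_subsets(clips)):
    print(threshold, ids)
```

With nested thresholds like these, each stricter subset is contained in the looser ones, which is one plausible reading of "five subsets"; the source does not say how the subsets relate to each other.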

The MiraData dataset provides valuable resources and new challenges for researchers working on long video generation and video content understanding.

MiraData.torrent
Seeding: 1, Downloading: 1, Completed: 80, Total Downloads: 76
  • MiraData/
    • README.md (2.02 KB)
    • README.txt (4.04 KB)
    • data/
      • MiraData.zip (315.23 MB)