Date

2 years ago

Size

315.23 MB

Organization

Publish URL

github.com

Paper URL

arxiv.org

* This dataset supports online use.

MiraData is a large-scale video dataset jointly developed in 2024 by Tencent PCG ARC Lab and the Chinese University of Hong Kong for long video generation tasks. The accompanying paper is "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions". Unlike earlier datasets built from short clips, MiraData focuses on uncut video clips of 1 to 2 minutes (average duration 72.1 seconds). Each video comes with structured captions written from several perspectives, averaging 318 words, so that the video content is described comprehensively. There are six caption types: subject description, background, style, camera movement, short caption, and dense caption.

To ensure clip quality, the research team filtered the dataset into five subsets based on aesthetics, motion intensity, and color, selecting clips with high visual quality and strong motion intensity. To obtain detailed and accurate descriptions, the team first generated short captions with a state-of-the-art captioning model and then enriched them with GPT-4V into dense captions that describe each video at a fine-grained level from multiple perspectives. MiraData offers a valuable resource, and new challenges, for researchers working on long video generation and video content understanding.
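As a rough illustration of the structure described above, the following Python sketch models one clip with its six caption types and applies a simple quality filter. It is not the MiraData release format or the authors' pipeline: the class names, field names, score ranges, and thresholds are all assumptions made for this example, and the color criterion is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class StructuredCaption:
    """The six caption types listed above; field names are hypothetical."""
    subject: str
    background: str
    style: str
    camera_movement: str
    short: str
    dense: str

    def word_count(self) -> int:
        """Count whitespace-separated words across all six fields."""
        fields = (self.subject, self.background, self.style,
                  self.camera_movement, self.short, self.dense)
        return sum(len(text.split()) for text in fields)


@dataclass
class Clip:
    clip_id: str
    duration_s: float
    aesthetic: float  # hypothetical score in [0, 1]
    motion: float     # hypothetical score in [0, 1]
    caption: StructuredCaption


def select_clips(clips, min_aesthetic=0.5, min_motion=0.5,
                 min_duration_s=60.0, max_duration_s=120.0):
    """Keep clips in the 1 to 2 minute range that pass both score thresholds.

    The threshold values are placeholders, not the ones used for MiraData.
    """
    return [
        c for c in clips
        if min_duration_s <= c.duration_s <= max_duration_s
        and c.aesthetic >= min_aesthetic
        and c.motion >= min_motion
    ]


caption = StructuredCaption(
    subject="a dog",
    background="a park",
    style="realistic",
    camera_movement="static shot",
    short="A dog runs.",
    dense="A dog runs across a sunny park.",
)
clips = [
    Clip("a", 72.0, 0.8, 0.7, caption),  # passes every check
    Clip("b", 30.0, 0.9, 0.9, caption),  # too short
    Clip("c", 90.0, 0.2, 0.9, caption),  # low aesthetic score
]
print([c.clip_id for c in select_clips(clips)])  # prints ['a']
```

Keeping the six captions as separate fields, rather than one long string, lets a training pipeline choose which perspectives to condition on for a given experiment.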

Citation

If you find this code useful for your research, please cite it:

MiraData.torrent

Seeding 1, Downloading 0, Completed 199, Total Downloads 268

MiraData/
- README.md
  2.02 KB
- README.txt
  4.04 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

Related Datasets

MAKIEVAL Multilingual Cultural Knowledge Assessment Dataset

11 hours ago

ChartNet Chart Understanding Multimodal Dataset

a month ago

ViMU Video Metaphor Understanding Dataset

a month ago

MathNet Multimodal Mathematical Benchmark Inference Dataset

a month ago

Long-Distance Wildfire & Smoke Detection Dataset

2 months ago

MiraData: A large-scale Video Dataset With Long Duration and Structured Captions
