Date

3 months ago

Size

10.51 MB

1. Tutorial Introduction

LongCat-Video is an open-source AI video generation model with 13.6 billion parameters, developed by Meituan's LongCat team. It handles text-to-video, image-to-video, and video continuation tasks, and is particularly efficient at generating high-quality long videos. The model is optimized with multi-reward reinforcement learning (GRPO) and performs comparably to leading open-source video generation models and state-of-the-art commercial solutions on internal and public benchmarks. For details, see the LongCat-Video Technical Report.

This tutorial runs on a single RTX PRO 6000 GPU. Four examples are provided for testing: Image-to-Video, Text-to-Video, Long-Video Generation, and Video Continuation.

2. Demo Results

1. Image-to-Video

2. Text-to-Video

3. Long-Video Generation

4. Video Continuation

3. Operation Steps

1. Start the container

2. Usage steps

If "Bad Gateway" is displayed, the model is still initializing. Because the model is large, wait about 5-6 minutes and then refresh the page.
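Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, assuming the web interface answers plain HTTP GET requests and returns status 200 once the model has loaded; the function names, the 15-second interval, and the 10-minute limit are illustrative choices, not part of the tutorial.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout_s=10.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 502 (Bad Gateway) while the model is loading
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def wait_until_ready(get_status, max_wait_s=600.0, interval_s=15.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns 200 or max_wait_s has elapsed."""
    deadline = clock() + max_wait_s
    while True:
        if get_status() == 200:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

To use it, pass a callable that checks your own container address, for example `wait_until_ready(lambda: http_status(page_url))`, where `page_url` is a placeholder for the address shown after the container starts.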

1. Image-to-Video

Parameter Description:

Negative Prompt: Lists elements you do not want in the output. The model is steered away from these features, which can improve the quality of the generated content.
Resolution: Specifies the width × height of the generated video, in pixels.
Seed: Sets the starting point for the randomness used during generation. Using the same Seed with otherwise identical settings makes results reproducible.
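The effect of a fixed Seed can be illustrated with a small sketch. It uses Python's standard `random` module rather than the model's own random number generator, so it only demonstrates the principle; `initial_noise` is a hypothetical helper, not a function from LongCat-Video.

```python
import random


def initial_noise(seed, count=4):
    """Draw `count` Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; global state is untouched
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


same_a = initial_noise(seed=42)
same_b = initial_noise(seed=42)
other = initial_noise(seed=43)

print(same_a == same_b)  # True: identical seed, identical samples
print(same_a == other)   # a different seed yields different samples
```

A generation run that starts from the same noise and the same settings follows the same path, which is why reusing a Seed is useful when comparing prompt changes.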

2. Text-to-Video

Parameter Description:

Negative Prompt: Lists elements you do not want in the output. The model is steered away from these features, which can improve the quality of the generated content.
Height: Specifies the height of the generated video, in pixels.
Width: Specifies the width of the generated video, in pixels.
Seed: Sets the starting point for the randomness used during generation. Using the same Seed with otherwise identical settings makes results reproducible.
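Many diffusion-based generators apply a negative prompt through classifier-free guidance: the model predicts once for the prompt and once for the negative prompt, then extrapolates away from the negative prediction. The sketch below shows that arithmetic on toy numbers. Whether LongCat-Video combines prompts in exactly this way is an assumption, and `guided_prediction` is a hypothetical name.

```python
def guided_prediction(positive, negative, guidance_scale):
    """Start at `negative` and move toward `positive` by `guidance_scale`."""
    if len(positive) != len(negative):
        raise ValueError("predictions must have the same length")
    return [n + guidance_scale * (p - n) for p, n in zip(positive, negative)]


positive = [1.0, 0.5, -0.25]  # toy prediction conditioned on the prompt
negative = [0.5, 0.5, 0.75]   # toy prediction conditioned on the negative prompt

# A scale of 1.0 reproduces the positive prediction unchanged.
print(guided_prediction(positive, negative, guidance_scale=1.0))
# A larger scale pushes the result further away from the negative prediction.
print(guided_prediction(positive, negative, guidance_scale=4.0))
```

Components where the two predictions agree are left unchanged, so only the features that distinguish the negative prompt are suppressed.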

3. Long-Video Generation

Long-Video Generation takes approximately 20 minutes.

Parameter Description:

Negative Prompt: Lists elements you do not want in the output. The model is steered away from these features, which can improve the quality of the generated content.
Number of Segments: Sets how many segments are generated. More segments produce a longer video.
Seed: Sets the starting point for the randomness used during generation. Using the same Seed with otherwise identical settings makes results reproducible.
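How the Number of Segments relates to video length can be estimated with a rough sketch. It assumes the video starts with one clip and that each segment extends it while reusing a few overlapping frames from the previous clip as conditioning. The frame counts, overlap, and frame rate below are illustrative values, not settings confirmed by this tutorial.

```python
def estimate_video_length(num_segments, frames_per_segment, overlap_frames, fps):
    """Return (total_frames, seconds) for a first clip plus `num_segments` extensions."""
    if not 0 <= overlap_frames < frames_per_segment:
        raise ValueError("overlap_frames must be in [0, frames_per_segment)")
    # Each extension contributes only the frames that are not reused as overlap.
    new_frames_per_segment = frames_per_segment - overlap_frames
    total_frames = frames_per_segment + num_segments * new_frames_per_segment
    return total_frames, total_frames / fps


# Illustrative values only; check the actual settings of the demo you run.
for segments in (1, 5, 10):
    frames, seconds = estimate_video_length(
        segments, frames_per_segment=93, overlap_frames=13, fps=15
    )
    print(segments, frames, round(seconds, 1))
```

Under these assumptions the length grows linearly with the number of segments, and so does the generation time.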

4. Video Continuation

Video Continuation takes approximately 20 minutes.

Parameter Description:

Negative Prompt: Lists elements you do not want in the output. The model is steered away from these features, which can improve the quality of the generated content.
Resolution: Specifies the width × height of the generated video, in pixels.
Seed: Sets the starting point for the randomness used during generation. Using the same Seed with otherwise identical settings makes results reproducible.

Citation Information

The citation information for this project is as follows:

@misc{meituanlongcatteam2025longcatvideotechnicalreport,
      title={LongCat-Video Technical Report}, 
      author={Meituan LongCat Team and Xunliang Cai and Qilong Huang and Zhuoliang Kang and Hongyu Li and Shijun Liang and Liya Ma and Siyu Ren and Xiaoming Wei and Rixu Xie and Tong Zhang},
      year={2025},
      eprint={2510.22200},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.22200}, 
}

