
Abstract
We present LongLive, a frame-level autoregressive (AR) framework for real-time and interactive long video generation. Long video generation presents challenges in both efficiency and quality. Diffusion and Diffusion-Forcing models can produce high-quality videos but suffer from low efficiency due to bidirectional attention. Causal-attention AR models support KV caching for faster inference, but often degrade in quality on long videos due to memory challenges during long-video training. In addition, beyond static prompt-based generation, interactive capabilities, such as streaming prompt inputs, are critical for dynamic content creation, enabling users to guide narratives in real time. This interactive requirement significantly increases complexity, especially in ensuring visual consistency and semantic coherence during prompt transitions. To address these challenges, LongLive adopts a causal, frame-level AR design that integrates a KV-recache mechanism that refreshes cached states with new prompts for smooth, prompt-adherent switches; streaming long tuning to enable long-video training and to align training and inference (train-long-test-long); and short window attention paired with a frame-level attention sink, shortened to frame sink, which preserves long-range consistency while enabling faster generation. With these key designs, LongLive fine-tunes a 1.3B-parameter short-clip model to minute-long generation in just 32 GPU-days. At inference, LongLive sustains 20.7 FPS on a single NVIDIA H100 and achieves strong performance on VBench for both short and long videos. LongLive supports up to 240-second videos on a single H100 GPU, and further supports INT8-quantized inference with only marginal quality loss.
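To make the short-window-plus-frame-sink idea concrete, below is a minimal PyTorch sketch of a frame-level attention mask in which every query frame attends causally to its recent window of frames plus a fixed set of initial sink frames. The function name frame_sink_mask and the parameters sink_frames and window_frames are illustrative assumptions, not identifiers from the paper or its code.

    # Minimal sketch of short window attention with a frame-level
    # attention sink ("frame sink"), under the assumptions stated above.
    import torch

    def frame_sink_mask(num_frames: int, sink_frames: int, window_frames: int) -> torch.Tensor:
        """Boolean mask [num_frames, num_frames]: True where query frame q
        may attend to key frame k."""
        q = torch.arange(num_frames).unsqueeze(1)  # query frame indices, column
        k = torch.arange(num_frames).unsqueeze(0)  # key frame indices, row
        causal = k <= q                        # no attention to future frames
        in_window = (q - k) < window_frames    # short local window of recent frames
        is_sink = k < sink_frames              # first frames always remain visible
        return causal & (in_window | is_sink)

    # Example: 8 frames, 1 sink frame, window of 3 recent frames.
    mask = frame_sink_mask(num_frames=8, sink_frames=1, window_frames=3)
    print(mask.int())

Keeping the sink frames attendable bounds the per-step attention cost by the window size while anchoring generation to the opening frames, which is one plausible way such a design can preserve long-range consistency at high frame rates.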