
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation

Haiyang Zhou, Wangbo Yu, Jiawen Guan, Xinhua Cheng, Yonghong Tian, Li Yuan
Publication date: 5/13/2025
Abstract

The rapid advancement of diffusion models holds the promise of revolutionizing the application of VR and AR technologies, which typically require scene-level 4D assets for user experience. Nonetheless, existing diffusion models predominantly concentrate on modeling static 3D scenes or object-level dynamics, constraining their capacity to provide truly immersive experiences. To address this issue, we propose HoloTime, a framework that integrates video diffusion models to generate panoramic videos from a single prompt or reference image, along with a 360-degree 4D scene reconstruction method that seamlessly transforms the generated panoramic video into 4D assets, enabling a fully immersive 4D experience for users. Specifically, to tame video diffusion models for generating high-fidelity panoramic videos, we introduce the 360World dataset, the first comprehensive collection of panoramic videos suitable for downstream 4D scene reconstruction tasks. With this curated dataset, we propose Panoramic Animator, a two-stage image-to-video diffusion model that can convert panoramic images into high-quality panoramic videos. Following this, we present Panoramic Space-Time Reconstruction, which leverages a space-time depth estimation method to transform the generated panoramic videos into 4D point clouds, enabling the optimization of a holistic 4D Gaussian Splatting representation to reconstruct spatially and temporally consistent 4D scenes. To validate the efficacy of our method, we conducted a comparative analysis with existing approaches, revealing its superiority in both panoramic video generation and 4D scene reconstruction. This demonstrates our method's capability to create more engaging and realistic immersive environments, thereby enhancing user experiences in VR and AR applications.
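
The abstract does not spell out how panoramic videos with estimated depth become 4D point clouds. As a rough illustration only, the sketch below unprojects equirectangular depth frames onto rays of a sphere and tags each point with a timestamp. The function names, the equirectangular layout, the axis convention, and the `fps` parameter are all assumptions made for this example; they are not taken from the paper and this is not the authors' implementation.

```python
import math

def equirect_to_points(depth, t=0.0):
    """Unproject one equirectangular depth map into (x, y, z, t) points.

    `depth` is a list of rows; each value is assumed to be the radial
    distance from the camera center along that pixel's viewing ray.
    """
    height = len(depth)
    width = len(depth[0])
    points = []
    for v in range(height):
        # Latitude goes from +pi/2 (top row) to -pi/2 (bottom row),
        # sampled at pixel centers.
        lat = math.pi / 2 - (v + 0.5) / height * math.pi
        for u in range(width):
            # Longitude goes from -pi to +pi across the image width.
            lon = (u + 0.5) / width * 2 * math.pi - math.pi
            r = depth[v][u]
            # Assumed axis convention: y up, z forward at lon = 0.
            x = r * math.cos(lat) * math.sin(lon)
            y = r * math.sin(lat)
            z = r * math.cos(lat) * math.cos(lon)
            points.append((x, y, z, t))
    return points

def video_to_4d_points(depth_frames, fps=8.0):
    """Stack per-frame point clouds, using frame_index / fps as the time coordinate."""
    cloud = []
    for i, depth in enumerate(depth_frames):
        cloud.extend(equirect_to_points(depth, t=i / fps))
    return cloud
```

In this sketch every point lies at its depth value's distance from the origin, so a constant depth map yields points on a sphere. A real pipeline would additionally need depth that is consistent across frames, which is the role the abstract assigns to space-time depth estimation, before such points could initialize a 4D Gaussian Splatting representation.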