HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Creating immersive and playable 3D worlds from text or images remains a fundamental challenge in computer vision and graphics. Existing world generation approaches typically fall into two categories: video-based methods that offer rich diversity but lack 3D consistency and rendering efficiency, and 3D-based methods that provide geometric consistency but struggle with limited training data and memory-inefficient representations. To address these limitations, we present HunyuanWorld 1.0, a novel framework that combines the best of both worlds for generating immersive, explorable, and interactive 3D scenes from text and image conditions. Our approach features three key advantages: 1) 360° immersive experiences via panoramic world proxies; 2) mesh export capabilities for seamless compatibility with existing computer graphics pipelines; 3) disentangled object representations for augmented interactivity. The core of our framework is a semantically layered 3D mesh representation that leverages panoramic images as 360° world proxies for semantic-aware world decomposition and reconstruction, enabling the generation of diverse 3D worlds. Extensive experiments demonstrate that our method achieves state-of-the-art performance in generating coherent, explorable, and interactive 3D worlds while enabling versatile applications in virtual reality, physical simulation, game development, and interactive content creation.
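To make the abstract's "semantically layered 3D mesh representation" with "disentangled object representations" concrete, here is a minimal sketch of what such a layered, exportable world structure could look like. This is not the authors' code: every class, field, and the OBJ exporter below are hypothetical illustrations, assuming only that each semantic layer (sky, background, individual objects) owns its own mesh and can be exported to a standard graphics format.

```python
# Minimal sketch (not the HunyuanWorld 1.0 implementation): a layered world
# where each semantic layer owns its own mesh, so objects stay disentangled
# and the whole scene can be exported to standard graphics pipelines.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MeshLayer:
    """One semantic layer of the world (e.g. sky, terrain, a single object)."""
    name: str                      # semantic label, e.g. "foreground/object_0"
    vertices: List[Tuple[float, float, float]]  # (x, y, z) positions
    faces: List[Tuple[int, int, int]]           # vertex-index triangles
    interactive: bool = False      # disentangled objects can be moved/simulated

@dataclass
class LayeredWorld:
    """A 360° world decomposed into independently editable mesh layers."""
    layers: List[MeshLayer] = field(default_factory=list)

    def export_obj(self, path: str) -> None:
        """Write all layers into one Wavefront OBJ file, one named group per
        layer, so conventional tools can load the scene directly."""
        offset = 1  # OBJ vertex indices are 1-based and global across groups
        with open(path, "w") as f:
            for layer in self.layers:
                f.write(f"g {layer.name}\n")
                for x, y, z in layer.vertices:
                    f.write(f"v {x} {y} {z}\n")
                for tri in layer.faces:
                    f.write("f " + " ".join(str(i + offset) for i in tri) + "\n")
                offset += len(layer.vertices)

# Example: an empty sky layer plus a one-triangle interactive object layer.
world = LayeredWorld([
    MeshLayer("sky", [], []),
    MeshLayer("foreground/object_0",
              [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
              [(0, 1, 2)],
              interactive=True),
])
world.export_obj("world.obj")
```

Keeping each layer as a separate mesh group is what would let a downstream engine delete, move, or simulate one object without touching the rest of the scene, which is the interactivity benefit the abstract claims.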