WorldGen Turns Text Prompts into Navigable 3D Worlds with Consistent, Interactive Environments

Imagine typing a simple prompt like “cartoon medieval village” or “sci-fi base station on Mars” and instantly generating a fully immersive, interactive 3D world that you can walk through and explore. The scene would be thematically and stylistically consistent (no anachronistic buildings or mismatched furniture) and geometrically coherent, allowing seamless navigation across connected spaces. Just a few years ago, this vision seemed far-fetched. But thanks to rapid advances in generative AI, researchers are now turning that dream into reality.

Today, we’re unveiling WorldGen: a cutting-edge, end-to-end system that creates fully navigable, interactive 3D worlds from a single text prompt. Built on a fusion of procedural reasoning, diffusion-based 3D generation, and object-aware scene decomposition, WorldGen produces visually rich, geometrically accurate, and render-efficient environments ideal for gaming, simulation, and immersive social experiences.

While previous AI systems have made impressive strides in generating high-quality 3D assets from text or images, most are limited to a narrow viewpoint or generate content incrementally from a central perspective. As a result, visual quality degrades quickly beyond a few meters. WorldGen overcomes this by first generating a global reference image of the entire scene, then reconstructing it into a full 3D layout with consistent geometry and textures across expansive areas (currently supporting scenes up to 50 x 50 meters) while preserving stylistic coherence throughout.

The process involves multiple stages: planning, procedural blockout generation, navmesh extraction, reference image creation, image-to-3D reconstruction, and detailed refinement. Key innovations include accelerated object extraction using AutoPartGen, data curation for scene decomposition, and specialized models for mesh refinement and texturing.
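To make the navmesh extraction stage more concrete, here is a minimal sketch of the kind of check a navigable layout has to pass: every walkable area should be reachable from every other. This is not WorldGen's implementation, which has not been released. The grid, the 10-meter tile size, and the names `BLOCKOUT`, `walkable_cells`, `reachable_from`, and `is_fully_connected` are all assumptions made for illustration, and a plain flood fill stands in for a real navmesh. It only tests connectivity and does not detect dead-end corridors.

```python
from collections import deque

# Hypothetical toy blockout (not WorldGen's data format):
# '#' is a blocked tile, '.' is walkable floor.
# Assume each tile is 10 m x 10 m, so 5 x 5 tiles cover a 50 m x 50 m scene.
BLOCKOUT = [
    "..#..",
    ".##..",
    ".....",
    "..##.",
    "..#..",
]


def walkable_cells(grid):
    """Return the set of (row, col) positions marked as walkable."""
    return {
        (r, c)
        for r, row in enumerate(grid)
        for c, ch in enumerate(row)
        if ch == "."
    }


def reachable_from(start, cells):
    """Breadth-first flood fill over 4-connected walkable cells."""
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt in cells and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_fully_connected(grid):
    """True if every walkable cell can be reached from any other one."""
    cells = walkable_cells(grid)
    if not cells:
        return False  # nothing to walk on counts as not navigable
    start = min(cells)  # any walkable cell works as the starting point
    return reachable_from(start, cells) == cells


if __name__ == "__main__":
    print(is_fully_connected(BLOCKOUT))  # True
    # A full-height wall splits this layout into two unreachable halves.
    print(is_fully_connected(["..#..", "..#..", "..#.."]))  # False
```

In a real engine this role is played by a baked navigation mesh rather than a tile grid, but the underlying question is the same: the walkable surface should form a single connected component.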
This multi-stage pipeline ensures that the final world is not only visually compelling but also functionally navigable, with no dead ends or broken connections. WorldGen’s output is fully compatible with major game engines like Unity and Unreal Engine, requiring no additional conversion or custom rendering pipelines. This makes it a powerful tool for rapid prototyping and content creation.

Although still in the research phase and not yet available to developers, WorldGen represents a major leap forward in democratizing 3D world creation. It paves the way for a future where anyone, regardless of technical skill, can build entire virtual worlds with just a text prompt, aligning with the vision shared at Connect for a more accessible, creative digital future.

We continue to work on improving the system, with future updates focused on expanding world size, reducing generation time, and enhancing overall quality. The current model, while limited in scale and speed, already demonstrates the transformative potential of AI in 3D content creation.

We’d like to thank the team behind this work: Dilin Wang†, Hyunyoung Jung, Tom Monnier, Kihyuk Sohn, Chuhang Zou, Xiaoyu Xiang, Yu-Ying Yeh, Di Liu, Zixuan Huang, Thu Nguyen-Phuoc, Yuchen Fan, Sergiu Oprea, Ziyan Wang, Roman Shapovalov, Nikolaos Sarafianos, Thibault Groueix, Antoine Toisoul, Prithviraj Dhar, Xiao Chu, Minghao Chen, Geon Yeong Park, Mahima Gupta, Yassir Azziz, Milton Cadogan, Christopher Ocampo, Sandy Kao, Rakesh Ranjan†, and Andrea Vedaldi†† (project lead).
