WorldGen Turns Text Prompts into Navigable 3D Worlds with Consistent, Immersive Design

Imagine typing a simple prompt like "cartoon medieval village" or "sci-fi base station on Mars" and instantly generating a fully interactive 3D world that you can walk through, explore, and experience. The world is stylistically consistent—no anachronistic buildings or mismatched furniture—and it's built to be navigable, with seamless connections between areas and no dead ends or stuck characters. Just a few years ago, this would have seemed like science fiction. But thanks to rapid advances in generative AI, we're now moving closer to that reality.

Today, we're unveiling WorldGen: a state-of-the-art, end-to-end system that creates fully navigable, interactive 3D worlds from a single text prompt. WorldGen integrates procedural reasoning, diffusion-based 3D generation, and object-aware scene decomposition to produce geometrically accurate, visually rich, and render-efficient environments. These worlds are designed for use in gaming, virtual simulations, and immersive social experiences.

While generative AI has made significant progress in creating high-quality 3D assets from text or image inputs, most existing methods are limited. Many generate content from a single viewpoint, building outward from a central perspective. As a result, quality degrades quickly beyond a few meters, and the overall scene often lacks structural coherence. WorldGen overcomes these limitations by first generating a global reference image of the entire scene, then using advanced image-to-3D reconstruction techniques to build a full, consistent 3D environment.
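The navigability property described above—every area reachable, no dead ends or stuck characters—can be stated as a simple connectivity condition. The sketch below illustrates that condition on a 2D walkability grid; it is purely didactic and is not WorldGen's actual navmesh algorithm, which operates on 3D geometry.

```python
from collections import deque

# Illustrative only: a minimal grid-based walkability check, not WorldGen's
# real navmesh pipeline. 1 = walkable cell, 0 = blocked.
def fully_connected(grid):
    """Return True if every walkable cell can reach every other one."""
    walkable = {(r, c) for r, row in enumerate(grid)
                for c, cell in enumerate(row) if cell == 1}
    if not walkable:
        return True
    start = next(iter(walkable))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first flood fill over 4-connected neighbors
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (nr, nc) in walkable and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen == walkable

open_world = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]   # a ring: one connected region
split_world = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]  # four isolated corners
print(fully_connected(open_world))   # True
print(fully_connected(split_world))  # False
```

A generator that enforces this invariant at planning time—rather than patching geometry afterward—is what keeps characters from ever getting stuck.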
The process involves multiple stages:

- Planning: procedural blockout generation and navmesh extraction
- Reference image generation: high-fidelity scene representation
- Reconstruction: image-to-3D base model creation
- Navmesh-based scene generation: ensures walkability and spatial logic
- Initial texture generation and decomposition: object extraction using an accelerated version of AutoPartGen
- Data curation and refinement: image enhancement, mesh refinement, and texturing

The result is a fully textured, 50 x 50 meter scene that maintains visual and geometric consistency across its entire area. Unlike other systems, WorldGen does not suffer from quality drop-off as users move through the world. We are already working on scaling this to even larger environments and reducing generation time.

Although still in the research phase and not yet available to developers, the 3D worlds created by WorldGen are compatible with standard game engines like Unity and Unreal Engine, requiring no additional conversion or custom rendering pipelines.

While WorldGen represents a major leap forward in AI-driven 3D content creation, it is not without limitations. Future versions will focus on supporting larger world sizes, improving generation speed, and enhancing object interactivity. The goal is to make the entire process faster, more scalable, and more accessible.

Creating 3D environments has traditionally been a time-intensive, expert-driven process, often requiring specialized skills and significant resources. WorldGen demonstrates the potential to dramatically reduce both time and cost, opening the door to a new era of democratized 3D content creation. This aligns with the vision we shared at Connect: a future where anyone, regardless of technical background, can build entire virtual worlds with just a few words.
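The staged process described above can be sketched as a sequential pipeline in which each stage consumes and enriches a shared scene state. The sketch below is a minimal illustration of that structure under assumed names; every identifier (`World`, `planning`, `worldgen`, and so on) is hypothetical, since WorldGen's real interfaces are not public.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a staged text-to-world pipeline like the one
# described above. All names are illustrative, not WorldGen's actual API.

@dataclass
class World:
    prompt: str
    log: list = field(default_factory=list)  # records which stages have run

def planning(w):           # procedural blockout + navmesh extraction
    w.log.append("planning"); return w

def reference_image(w):    # high-fidelity global scene representation
    w.log.append("reference image"); return w

def reconstruction(w):     # image-to-3D base model creation
    w.log.append("reconstruction"); return w

def navmesh_scene(w):      # enforce walkability and spatial logic
    w.log.append("navmesh scene"); return w

def texture_decompose(w):  # initial textures + object extraction
    w.log.append("texture/decompose"); return w

def refine(w):             # image enhancement, mesh refinement, texturing
    w.log.append("refine"); return w

STAGES = [planning, reference_image, reconstruction,
          navmesh_scene, texture_decompose, refine]

def worldgen(prompt):
    """Run each stage in order over a shared scene state."""
    w = World(prompt)
    for stage in STAGES:
        w = stage(w)
    return w

world = worldgen("cartoon medieval village")
print(world.log)  # the six stages, in pipeline order
```

Structuring generation as explicit stages, rather than a single monolithic model, is what lets a system validate intermediate artifacts—such as the navmesh—before committing to expensive texturing and refinement.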
We extend our gratitude to the team whose work made this possible: Dilin Wang†, Hyunyoung Jung, Tom Monnier, Kihyuk Sohn, Chuhang Zou, Xiaoyu Xiang, Yu-Ying Yeh, Di Liu, Zixuan Huang, Thu Nguyen-Phuoc, Yuchen Fan, Sergiu Oprea, Ziyan Wang, Roman Shapovalov, Nikolaos Sarafianos, Thibault Groueix, Antoine Toisoul, Prithviraj Dhar, Xiao Chu, Minghao Chen, Geon Yeong Park, Mahima Gupta, Yassir Azziz, Milton Cadogan, Christopher Ocampo, Sandy Kao, Rakesh Ranjan†, Andrea Vedaldi††, project lead.