
Abstract
World modeling has become a cornerstone in AI research, enabling agents to understand, represent, and predict the dynamic environments they inhabit. While prior work largely emphasizes generative methods for 2D image and video data, it overlooks the rapidly growing body of work that leverages native 3D and 4D representations such as RGB-D imagery, occupancy grids, and LiDAR point clouds for large-scale scene modeling. At the same time, the absence of a standardized definition and taxonomy for "world models" has led to fragmented and sometimes inconsistent claims in the literature. This survey addresses these gaps by presenting the first comprehensive review explicitly dedicated to 3D and 4D world modeling and generation. We establish precise definitions, introduce a structured taxonomy spanning video-based (VideoGen), occupancy-based (OccGen), and LiDAR-based (LiDARGen) approaches, and systematically summarize datasets and evaluation metrics tailored to 3D/4D settings. We further discuss practical applications, identify open challenges, and highlight promising research directions, aiming to provide a coherent and foundational reference for advancing the field. A systematic summary of existing literature is available at https://github.com/worldbench/survey