NTU Unveils First Simulation-Ready 3D Model for Robot Training
Researchers at Singapore’s Nanyang Technological University have unveiled PhysX-Omni, a unified 3D generative framework that produces physics-ready digital assets from single images. Led by Associate Professor Liu Ziwei, the team’s work addresses a gap between computer vision and robotics by embedding real-world physical properties, such as mass, material stiffness, joint kinematics, and absolute scale, directly into generated models. Published on arXiv, the framework is intended to serve as scalable data infrastructure for embodied AI and physical simulation.
Traditional 3D generation has largely ignored physical realism, often producing assets that look plausible but cannot be used in simulation. PhysX-Omni unifies the generation of rigid, deformable, and articulated objects within a single pipeline. The architecture builds on a 7-billion-parameter vision-language model and introduces a template-based run-length encoding scheme. This method slices three-dimensional meshes into two-dimensional masks and compresses them into text tokens, preserving high-resolution geometric detail while reducing computational overhead. As a result, the model predicts physical properties far more accurately than prior systems, cutting absolute scale error from approximately 300 units to 2.79, a reduction of roughly two orders of magnitude.
To train and evaluate the framework, the researchers constructed PhysXVerse, a curated dataset comprising over 8,700 high-fidelity assets across 2,900 indoor and outdoor categories, complete with verified physical annotations. They also introduced PhysX-Bench, a simulation-driven evaluation benchmark that tests geometric fidelity, material properties, kinematic behavior, and functional affordances without relying on costly manual labeling. Across these metrics, PhysX-Omni outperformed existing methods and generalized well in open-world scenarios.
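To make the slice-and-compress idea behind the encoding scheme more concrete, here is a minimal sketch of plain run-length encoding applied to the 2D slices of an occupancy grid. It is not the team's template-based scheme, whose details are not given here. It assumes the shape has already been voxelized into a binary NumPy array, and the function names (`rle_encode_mask`, `rle_decode_mask`, `encode_volume`) and the `HxW:run,run,...` text format are hypothetical choices made for illustration.

```python
import numpy as np


def rle_encode_mask(mask):
    """Encode a 2D binary mask as text: 'HxW:' followed by run lengths.

    Runs alternate between 0s and 1s in row-major order and always start
    with a run of 0s (of length 0 if the mask begins with a 1).
    """
    arr = np.asarray(mask, dtype=np.uint8)
    h, w = arr.shape
    flat = arr.ravel()
    # Indices where the value differs from the previous element.
    change = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    runs = np.diff(bounds)
    if flat.size and flat[0] == 1:
        runs = np.concatenate(([0], runs))
    return f"{h}x{w}:" + ",".join(str(int(r)) for r in runs)


def rle_decode_mask(text):
    """Invert rle_encode_mask and return a uint8 array of shape (H, W)."""
    shape_part, runs_part = text.split(":")
    h, w = (int(v) for v in shape_part.split("x"))
    runs = [int(r) for r in runs_part.split(",")]
    values = np.arange(len(runs)) % 2  # 0, 1, 0, 1, ...
    return np.repeat(values, runs).astype(np.uint8).reshape(h, w)


def encode_volume(voxels):
    """Slice a (D, H, W) occupancy grid along its first axis and encode each slice."""
    return [rle_encode_mask(slice_2d) for slice_2d in voxels]


if __name__ == "__main__":
    # Illustrative input: a solid sphere inside a 16^3 grid.
    z, y, x = np.mgrid[:16, :16, :16]
    sphere = ((z - 8) ** 2 + (y - 8) ** 2 + (x - 8) ** 2 <= 36).astype(np.uint8)
    tokens = encode_volume(sphere)
    restored = np.stack([rle_decode_mask(t) for t in tokens])
    print(tokens[8])
    print("lossless:", np.array_equal(sphere, restored))
```

Because each slice of a solid object consists of a few long runs of filled and empty cells, the text form stays short even at high resolution, which is the general property that makes this family of encodings attractive as input to a language model.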
The practical impact centers on lowering barriers to high-fidelity simulation. Because the generated assets require no manual post-processing and integrate with mainstream physics engines, the framework cuts simulation preparation costs to between one-twentieth and one-tenth of those incurred with conventional software. Initial deployments in robot policy learning, including garment folding, cabinet manipulation, and appliance operation, show stable dynamic interactions and reliable physical consistency. Industry players, including Daosheng Robotics and several Silicon Valley startups, have already expressed interest in integrating the technology into their simulation pipelines.
Beyond robotics, PhysX-Omni could shorten production cycles in gaming and virtual production, and may accelerate AI-driven scientific discovery by substituting high-precision virtual testing for labor-intensive physical experiments. The research team plans to extend the model's capabilities toward scene-level synthesis, better use of long-tail data, and more naturalistic spatial relationships. As the field moves from synthetic validation to real-world deployment, the team positions PhysX-Omni as a bridge between visual generation and interactive physical intelligence.
