Matrix-Game: Interactive World Foundation Model

Yifan Zhang, Chunli Peng, Boyang Wang, Puyi Wang, Qingcheng Zhu, Fei Kang, Biao Jiang, Zedong Gao, Eric Li, Yang Liu, Yahui Zhou
Abstract

We introduce Matrix-Game, an interactive world foundation model for controllable game world generation. Matrix-Game is trained using a two-stage pipeline that first performs large-scale unlabeled pretraining for environment understanding, followed by action-labeled training for interactive video generation. To support this, we curate Matrix-Game-MC, a comprehensive Minecraft dataset comprising over 2,700 hours of unlabeled gameplay video clips and over 1,000 hours of high-quality labeled clips with fine-grained keyboard and mouse action annotations. Our model adopts a controllable image-to-world generation paradigm, conditioned on a reference image, motion context, and user actions. With over 17 billion parameters, Matrix-Game enables precise control over character actions and camera movements, while maintaining high visual quality and temporal coherence. To evaluate performance, we develop GameWorld Score, a unified benchmark measuring visual quality, temporal quality, action controllability, and physical rule understanding for Minecraft world generation. Extensive experiments show that Matrix-Game consistently outperforms prior open-source Minecraft world models (including Oasis and MineWorld) across all metrics, with particularly strong gains in controllability and physical consistency. Double-blind human evaluations further confirm the superiority of Matrix-Game, highlighting its ability to generate perceptually realistic and precisely controllable videos across diverse game scenarios. To facilitate future research on interactive image-to-world generation, we will open-source the Matrix-Game model weights and the GameWorld Score benchmark at https://github.com/SkyworkAI/Matrix-Game.