
NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance

Wenzhe Cai, Jiaqi Peng, Yuqiang Yang, Yujian Zhang, Meng Wei, Hanqing Wang, Yilun Chen, Tai Wang, Jiangmiao Pang
Publication date: 5/14/2025
Abstract

Learning navigation in dynamic open-world environments is an important yet challenging skill for robots. Most previous methods rely on precise localization and mapping or learn from expensive real-world demonstrations. In this paper, we propose the Navigation Diffusion Policy (NavDP), an end-to-end framework that is trained solely in simulation and can transfer zero-shot to different embodiments in diverse real-world environments. The key ingredient of NavDP's network is the combination of diffusion-based trajectory generation and a critic function for trajectory selection, both of which are conditioned only on local observation tokens encoded by a shared policy transformer. Given the privileged information of the global environment in simulation, we scale up the generation of good-quality demonstrations to train the diffusion policy and formulate the critic value function targets with contrastive negative samples. Our demonstration generation approach achieves about 2,500 trajectories per GPU per day, which is 20 times more efficient than real-world data collection, and results in a large-scale navigation dataset with 363.2 km of trajectories across 1244 scenes. Trained with this simulation dataset, NavDP achieves state-of-the-art performance and consistently outstanding generalization capability on quadruped, wheeled, and humanoid robots in diverse indoor and outdoor environments. In addition, we present a preliminary attempt at using Gaussian Splatting to perform in-domain real-to-sim fine-tuning to further bridge the sim-to-real gap. Experiments show that adding such real-to-sim data can improve the success rate by 30% without hurting its generalization capability.
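
The generate-then-select idea described in the abstract can be illustrated with a toy sketch. This is not the NavDP implementation: the sampler below is a random-walk stand-in for the diffusion policy, the critic is a hand-written clearance heuristic standing in for the learned value function, and every name (`sample_candidates`, `toy_critic`, `select_trajectory`, the obstacle list, the step size) is a hypothetical chosen for illustration only.

```python
import math
import random


def sample_candidates(num_candidates, horizon, rng):
    """Stand-in for the diffusion policy: draw random 2D waypoint trajectories."""
    candidates = []
    for _ in range(num_candidates):
        heading = rng.uniform(-math.pi / 3, math.pi / 3)
        x, y, traj = 0.0, 0.0, []
        for _ in range(horizon):
            heading += rng.gauss(0.0, 0.1)  # small random heading drift
            x += 0.2 * math.cos(heading)    # assumed 0.2 m step per waypoint
            y += 0.2 * math.sin(heading)
            traj.append((x, y))
        candidates.append(traj)
    return candidates


def toy_critic(traj, obstacles, safe_radius=0.3):
    """Stand-in for the learned critic: higher scores mean safer trajectories."""
    clearance = min(math.dist(p, o) for p in traj for o in obstacles)
    progress = traj[-1][0]  # forward distance covered along x
    penalty = 10.0 if clearance < safe_radius else 0.0
    return progress + clearance - penalty


def select_trajectory(candidates, obstacles):
    """Score every candidate and return the highest-scoring one with all scores."""
    scores = [toy_critic(t, obstacles) for t in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


rng = random.Random(0)
obstacles = [(1.0, 0.0)]  # hypothetical obstacle directly ahead of the robot
candidates = sample_candidates(16, 8, rng)
best, scores = select_trajectory(candidates, obstacles)
```

In the paper's setting both components would be learned networks conditioned on the same observation tokens; the sketch only shows the control flow of sampling several candidate trajectories and keeping the one the critic rates highest.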