
Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning

Sheng Chen, Peiyu He, Jiaxin Hu, Ziyang Liu, Yansheng Wang, Tao Xu, Chi Zhang, Chongchong Zhang, Chao An, Shiyu Cai, Duo Cao, Kangping Chen, Shuai Chu, Tianwei Chu, Mingdi Dan, Min Du, Weiwei Fang, Pengyou Fu, Junkai Hu, Xiaowei Jiang, Zhaodi Jiang, Fuxuan Li, Jun Li, Minghui Li, Mingyao Li, Yanchang Li, Zhibin Li, Guangming Liu, Kairui Liu, Lihao Liu, Weizhi Liu, Xiaoshun Liu, Yufei Liu, Yunfei Liu, Qiang Lu, Yuanfei Luo, Xiang Lv, Hongying Ma, Sai Ma, Lingxian Mi, Sha Sa, Hongxiang Shu, Lei Tian, Chengzhi Wang, Jiayu Wang, Kaijie Wang, Qingyi Wang, Renwen Wang, Tao Wang, Wei Wang, Xirui Wang, Chao Wei, Xuguang Wei, Zijun Xia, Zhaohao Xiao, Tingshuai Yan, Liyan Yang, Yifan Yang, Zhikai Yang, Zhong Yin, Li Yuan, Liuchun Yuan, Chi Zhang, Jinyang Zhang, Junhui Zhang, Linge Zhang, Zhenyi Zhang, Zheyu Zhang, Dongjie Zhu, Hang Li, Yangang Zhang
Published: 6/10/2025
Abstract

Modern robot navigation systems encounter difficulties in diverse and complex indoor environments. Traditional approaches rely on multiple modules with small models or rule-based systems and thus lack adaptability to new environments. To address this, we developed Astra, a comprehensive dual-model architecture comprising Astra-Global and Astra-Local, for mobile robot navigation. Astra-Global, a multimodal LLM, processes vision and language inputs to perform self-localization and goal localization using a hybrid topological-semantic graph as the global map, and outperforms traditional visual place recognition methods. Astra-Local, a multitask network, handles local path planning and odometry estimation. Its 4D spatial-temporal encoder, trained through self-supervised learning, generates robust 4D features for downstream tasks. The planning head utilizes flow matching and a novel masked ESDF loss to minimize collision risks when generating local trajectories, and the odometry head integrates multi-sensor inputs via a transformer encoder to predict the relative pose of the robot. Deployed on real in-house mobile robots, Astra achieves a high end-to-end mission success rate across diverse indoor environments.
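
The abstract names a masked ESDF loss but does not define it. As a rough illustration of the general idea only, the sketch below penalizes trajectory waypoints that come closer to obstacles than a safety margin, using a Euclidean signed distance field (ESDF), and skips cells flagged as unreliable by a mask. The function name `masked_esdf_penalty`, the grid representation, the hinge form, and the `safety_margin` parameter are all assumptions made for this example, not the formulation used in Astra.

```python
from typing import Sequence, Tuple


def masked_esdf_penalty(
    esdf: Sequence[Sequence[float]],       # signed distance to nearest obstacle per cell (assumed units: metres)
    mask: Sequence[Sequence[bool]],        # True where the ESDF value is trusted (hypothetical masking rule)
    waypoints: Sequence[Tuple[int, int]],  # trajectory as (row, col) grid cells
    safety_margin: float = 0.5,            # hypothetical clearance threshold
) -> float:
    """Mean hinge penalty over the waypoints that land in trusted cells.

    A waypoint contributes max(0, safety_margin - distance), so it costs
    nothing when it is at least `safety_margin` away from any obstacle.
    """
    total = 0.0
    counted = 0
    for row, col in waypoints:
        if not mask[row][col]:
            # Masked-out cell: the distance value is ignored entirely.
            continue
        total += max(0.0, safety_margin - esdf[row][col])
        counted += 1
    # With no trusted waypoints there is nothing to penalize.
    return total / counted if counted else 0.0


if __name__ == "__main__":
    esdf = [[1.0, 0.2], [0.0, -0.3]]
    mask = [[True, True], [False, True]]
    path = [(0, 0), (0, 1), (1, 0), (1, 1)]
    print(masked_esdf_penalty(esdf, mask, path))
```

In a learned planner a term like this would be computed on differentiable tensors and added to the trajectory-generation objective; the pure-Python version here only shows the arithmetic.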