
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning

Guanting Dong, Yifei Chen, Xiaoxi Li, Jiajie Jin, Hongjin Qian, Yutao Zhu, Hangyu Mao, Guorui Zhou, Zhicheng Dou, Ji-Rong Wen
Publication date: May 25, 2025
Abstract

Recently, large language models (LLMs) have shown remarkable reasoning capabilities via large-scale reinforcement learning (RL). However, leveraging the RL algorithm to empower effective multi-tool collaborative reasoning in LLMs remains an open challenge. In this paper, we introduce Tool-Star, an RL-based framework designed to empower LLMs to autonomously invoke multiple external tools during stepwise reasoning. Tool-Star integrates six types of tools and incorporates systematic designs in both data synthesis and training. To address the scarcity of tool-use data, we propose a general tool-integrated reasoning data synthesis pipeline, which combines tool-integrated prompting with hint-based sampling to automatically and scalably generate tool-use trajectories. A subsequent quality normalization and difficulty-aware classification process filters out low-quality samples and organizes the dataset from easy to hard. Furthermore, we propose a two-stage training framework to enhance multi-tool collaborative reasoning by: (1) cold-start fine-tuning, which guides LLMs to explore reasoning patterns via tool-invocation feedback; and (2) a multi-tool self-critic RL algorithm with hierarchical reward design, which reinforces reward understanding and promotes effective tool collaboration. Experimental analyses on over 10 challenging reasoning benchmarks highlight the effectiveness and efficiency of Tool-Star. The code is available at https://github.com/dongguanting/Tool-Star.
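The abstract does not spell out the hierarchical reward design used in stage (2). As a rough illustration only, a hierarchical reward for multi-tool reasoning might gate format and collaboration bonuses on answer correctness, so the policy is first rewarded for getting answers right and only then for well-formed, cooperative tool use. The function below is a hypothetical sketch, not the paper's actual reward:

```python
def hierarchical_reward(answer_correct: bool,
                        tool_calls_well_formed: bool,
                        distinct_tools_used: int) -> float:
    """Toy hierarchical reward: tool-related bonuses are only granted
    on top of a correct final answer (all values are illustrative)."""
    if not answer_correct:
        return 0.0  # no credit without a correct answer
    reward = 1.0  # base reward for answer correctness
    if tool_calls_well_formed:
        reward += 0.5  # bonus for correctly formatted tool invocations
        if distinct_tools_used >= 2:
            reward += 0.5  # extra bonus for multi-tool collaboration
    return reward

print(hierarchical_reward(True, True, 2))   # 2.0
print(hierarchical_reward(True, True, 1))   # 1.5
print(hierarchical_reward(False, True, 3))  # 0.0
```

The gating structure (correctness before format, format before collaboration) is the point of the sketch; the specific bonus magnitudes are arbitrary placeholders.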