VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Abstract

Reinforcement Learning with Verifiable Rewards (RLVR) has demonstrated success in enhancing LLM reasoning capabilities, but remains limited to single-turn interactions without tool integration. While recent Agentic Reinforcement Learning with Tool use (ARLT) approaches have emerged to address multi-turn tool interactions, existing works develop task-specific codebases that suffer from fragmentation, synchronous execution bottlenecks, and limited extensibility across domains. These inefficiencies hinder broader community adoption and algorithmic innovation. We introduce VerlTool, a unified and modular framework that addresses these limitations through systematic design principles. VerlTool provides four key contributions: (1) upstream alignment with VeRL ensuring compatibility and simplified maintenance, (2) unified tool management via standardized APIs supporting diverse modalities including code execution, search, SQL databases, and vision processing, (3) asynchronous rollout execution achieving near 2× speedup by eliminating synchronization bottlenecks, and (4) comprehensive evaluation demonstrating competitive performance across 6 ARLT domains. Our framework formalizes ARLT as multi-turn trajectories with multi-modal observation tokens (text/image/video), extending beyond single-turn RLVR paradigms. We train and evaluate models on mathematical reasoning, knowledge QA, SQL generation, visual reasoning, web search, and software engineering tasks, achieving results comparable to specialized systems while providing unified training infrastructure. The modular plugin architecture enables rapid tool integration requiring only lightweight Python definitions, significantly reducing development overhead and providing a scalable foundation for tool-augmented RL research. Our code is open-sourced at https://github.com/TIGER-AI-Lab/verl-tool.
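
The synchronization bottleneck named in contribution (3) can be illustrated with a toy timing model. The sketch below is not VerlTool code and does not reproduce the paper's measurements. The function names and the latency values are hypothetical. It assumes that tool latency dominates each turn, that tool workers are unlimited, and that generation time is ignored. Under those assumptions, a batch-synchronous rollout pays for the slowest tool call at every turn, whereas an asynchronous rollout finishes when its slowest single trajectory finishes.

```python
from typing import List


def synchronous_makespan(latencies: List[List[float]]) -> float:
    """Batch-synchronous rollout: a barrier closes every turn, so each
    turn costs as much as the slowest tool call in the batch."""
    num_turns = max(len(traj) for traj in latencies)
    total = 0.0
    for turn in range(num_turns):
        # Only trajectories that are still running take part in this turn.
        total += max(traj[turn] for traj in latencies if turn < len(traj))
    return total


def asynchronous_makespan(latencies: List[List[float]]) -> float:
    """Asynchronous rollout: every trajectory advances on its own, so the
    batch finishes when the slowest single trajectory finishes."""
    return max(sum(traj) for traj in latencies)


# Hypothetical per-turn tool latencies in seconds (rows are trajectories).
latencies = [
    [1.0, 3.0, 1.0],
    [3.0, 1.0, 1.0],
    [1.0, 1.0, 3.0],
]

sync_time = synchronous_makespan(latencies)
async_time = asynchronous_makespan(latencies)
print(f"synchronous:  {sync_time:.1f} s")
print(f"asynchronous: {async_time:.1f} s")
print(f"ratio:        {sync_time / async_time:.2f}x")
```

In this model the gap grows with the spread of tool latencies within a turn. If every call takes the same time, the two schedules cost the same.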
