2 months ago

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Abstract

Vision-Language-Action (VLA) models have recently emerged as a powerful paradigm for robotic manipulation. Despite substantial progress enabled by large-scale pretraining and supervised fine-tuning (SFT), these models face two fundamental challenges: (i) the scarcity and high cost of the large-scale human-operated robotic trajectories required for SFT scaling, and (ii) limited generalization to tasks involving distribution shift. Recent breakthroughs in Large Reasoning Models (LRMs) demonstrate that reinforcement learning (RL) can dramatically enhance step-by-step reasoning capabilities, raising a natural question: can RL similarly improve the long-horizon, step-by-step action planning of VLA models? In this work, we introduce SimpleVLA-RL, an efficient RL framework tailored for VLA models. Building upon veRL, we introduce VLA-specific trajectory sampling, scalable parallelization, multi-environment rendering, and optimized loss computation. When applied to OpenVLA-OFT, SimpleVLA-RL achieves state-of-the-art (SoTA) performance on LIBERO and even outperforms π₀ on RoboTwin 1.0 and 2.0 with the exploration-enhancing strategies we introduce. SimpleVLA-RL not only reduces dependence on large-scale data and enables robust generalization, but also remarkably surpasses SFT in real-world tasks. Moreover, we identify a novel phenomenon, "pushcut", during RL training, wherein the policy discovers new patterns beyond those seen earlier in the training process. GitHub: https://github.com/PRIME-RL/SimpleVLA-RL
