ROCKET-2 is a visuomotor control framework based on cross-view goal alignment, released by the CraftJarvis team on March 21, 2025. It targets complex task control problems in robotics. By aligning goals across multiple camera views, the project improves the generalization and controllability of visuomotor policies in dynamic environments. The accompanying paper is "ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment".
This tutorial runs on a single RTX 4090 GPU.
2. Project Examples
3. Operation Steps
1. After starting the container, click the API address to enter the Web interface
If "Bad Gateway" is displayed, it means the model is initializing. Since the model is large, please wait about 1-2 minutes and refresh the page.
2. Once the web page loads, you can start interacting with the model
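The "Bad Gateway" wait described in step 1 can also be automated rather than refreshing by hand. The following is a minimal sketch, not part of the ROCKET-2 project: it polls an address until it stops answering with HTTP 502. The function names `fetch_status` and `wait_until_ready`, the 10-second interval, and the 180-second limit are all illustrative assumptions; only Python's standard library is used.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a GET to `url`, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code
    except OSError:
        # Connection refused, DNS failure, or timeout: treat as not ready yet.
        return None


def wait_until_ready(url, fetch=fetch_status, interval=10.0, max_wait=180.0,
                     sleep=time.sleep):
    """Poll `url` until it answers with something other than 502 Bad Gateway.

    Returns True once the page responds, False if `max_wait` seconds of
    waiting pass first. `fetch` and `sleep` are injectable for testing.
    """
    waited = 0.0
    while True:
        status = fetch(url)
        if status is not None and status != 502:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

To use it, call `wait_until_ready(api_url)` with the API address shown for your container (no specific address is assumed here) and open the page in a browser once it returns `True`.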
How to use
1. Go to Tutorial to view the tutorial guide
2. Go to Customize Environment and select the environment you want to load
3. Go to Launch Rocket to load the environment
4. Go to Specify Goal and select the target point and interaction method
5. Go to the Setting Panel in Launch Rocket and select the model
6. Go to the Control Panel in Launch Rocket, set the number of inference steps, and run inference
7. Repeat steps 4-6 until the inference process is complete. Then switch to Record Video mode to create and download the video. The video cannot be played online, so download it to watch it.
Citation Information
The citation information for this project is as follows:
@article{cai2025rocket,
title={ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment},
author={Cai, Shaofei and Mu, Zhancun and Liu, Anji and Liang, Yitao},
journal={arXiv preprint arXiv:2503.02505},
year={2025}
}