HyperAI

QwQ-32B One-click Deployment Tutorial Is Online, Performance Is Comparable to the Full-capability Version of DeepSeek-R1

特色图像

Yesterday, Alibaba Cloud suddenly made a big move and open-sourced a new reasoning model, Tongyi Qianwen QwQ-32B.On multiple key benchmarks, it surpassed OpenAI-o1-mini with 32B parameters and was comparable to the full-blooded version of DeepSeek-R1 with 671B parameters. QwQ-32B not only has amazing performance, but also significantly reduces the cost of deployment while maintaining strong performance. It can also be deployed locally on consumer-grade graphics cards, making it a model of strength and cost-effectiveness.

QwQ-32B scores compared with DeepSeek-R1-671B and other inference models in multiple benchmarks

On the technical level, QwQ-32B adopts a two-stage reinforcement learning method based on cold start. The first stage focuses on mathematics and code tasks, and uses mathematical verifiers and code sandboxes to focus on improving the model's logical reasoning ability.

In the second phase, the answer verification mechanism is used to replace the traditional reward model. For mathematical problems, feedback is given based on the correctness of the results. For programming tasks, real-time evaluation is performed on the server through test cases to improve general capabilities. In addition, QwQ-32B also integrates Agent-related functions, enabling it to flexibly adjust the reasoning process based on environmental feedback, significantly enhancing the autonomy and adaptability of the model.

"Using vLLM to deploy QwQ-32B" is now available in the "Tutorials" section of HyperAI's official website.Small parameters and great power, waiting for you to verify!

Tutorial address:

https://go.hyper.ai/1YmGY

Demo Run

1. Log in to hyper.ai, on the Tutorial page, select Deploy QwQ-32B using vLLM, and click Run this tutorial online.

2. After the page jumps, click "Clone" in the upper right corner to clone the tutorial into your own container.

3. Select "NVIDIA A6000-2" and "vllm" images. The OpenBayes platform has launched a new billing method. You can choose "pay as you go" or "daily/weekly/monthly" according to your needs. Click "Continue". New users can register using the invitation link below to get 4 hours of RTX 4090 + 5 hours of CPU free time!

HyperAI exclusive invitation link (copy and open in browser):

https://openbayes.com/console/signup?r=Ada0322_NR0n

4. Wait for resources to be allocated. The first clone will take about 2 minutes. When the status changes to "Running", click the jump arrow next to "API Address" to jump to the Demo page. Please note that users must complete real-name authentication before using the API address access function.

Effect display

1. There is a lot of discussion online about which one is better, QwQ-32B or DeepSeek. Let's ask QwQ-32B and see how it answers.

2. It can be seen that QwQ-32B will demonstrate a complete thinking process and objectively give analysis from multiple angles.