
Online Tutorial: Deploy Large Models Without Any Pressure! Run Llama 3.1 405B and Mistral Large 2 With One Click


On July 23rd local time, Meta officially released Llama 3.1. Its flagship 405B-parameter version marked a highlight moment for open-source models: in multiple benchmark tests, its performance caught up with, and in some cases surpassed, the existing SOTA models GPT-4o and Claude 3.5 Sonnet.


Zuckerberg also published a long essay, "Open Source AI Is the Path Forward", on the day of Llama 3.1's release, saying that Llama 3.1 would be a turning point for the industry. Meanwhile, the industry is eager to try out Llama 3.1's powerful capabilities and is watching closely to see how closed-source large models will respond.

Interestingly, just as Llama 3.1 was vying for the throne, Mistral AI launched Mistral Large 2 to confront the hard-to-deploy 405B model head-on.

Undoubtedly, the hardware required for a 405B-parameter model is not a threshold that individual developers can easily cross, and most enthusiasts can only look on from the sidelines. Mistral Large 2, by contrast, has only 123B parameters, less than one-third of Llama 3.1 405B, so its deployment threshold is much lower, yet its performance can still compete with Llama 3.1.

For example, on the MultiPL-E multi-language programming benchmark, Mistral Large 2's average score surpassed Llama 3.1 405B and trailed GPT-4o by only about 1%, beating Llama 3.1 405B on Python, C++, Java, and more. As the official announcement puts it, Mistral Large 2 pushes the frontier of performance relative to serving cost.


On one side stands the current "ceiling" of open-source model scale; on the other, a new open-source leader with exceptional cost-effectiveness. Surely nobody wants to miss either of them! Don't worry: HyperAI has launched one-click deployment tutorials for Llama 3.1 405B and Mistral Large 2407. You don't need to enter any commands; just click "Clone" to experience them.

* Use Open WebUI to deploy the Llama 3.1 405B model in one click:

https://go.hyper.ai/iyL60

* Use Open WebUI to deploy Mistral Large 2407 123B in one click:

https://go.hyper.ai/Bwf6G

At the same time, we have also prepared advanced tutorials, you can choose as needed:

* One-click deployment of Llama 3.1 405B model OpenAI compatible API service:

https://go.hyper.ai/1AiDi

* One-click deployment of Mistral Large 2407 123B model OpenAI compatible API service:

https://go.hyper.ai/Smexo

I deployed Mistral Large 2407 123B with one click using Open WebUI and ran a quick test. Large models frequently stumble on the question "which is bigger, 9.9 or 9.11?", and Mistral Large 2 is no exception:


If you're interested, come and try it yourself. The detailed tutorial is as follows ⬇️

Demo Run

This tutorial takes "Use Open WebUI to deploy Mistral Large 2407 123B in one click" and "One-click deployment of Llama 3.1 405B model OpenAI compatible API service" as examples and walks you through the steps.


Use Open WebUI to deploy Mistral Large 2407 123B in one click


1. Log in to hyper.ai. On the Tutorial page, select "Deploy Mistral Large 2407 123B with Open WebUI" and click "Run this tutorial online".


2. After the page jumps, click "Clone" in the upper right corner to clone the tutorial into your own container.


3. Click "Next: Select Hashrate" in the lower right corner.


4. After the page jumps, select "NVIDIA RTX A6000-2" for compute and "vllm" for the image, then click "Next: Review". New users can register via the invitation link below to get 4 hours of RTX 4090 plus 5 hours of CPU free time!

HyperAI exclusive invitation link (copy and open in browser):

https://openbayes.com/console/signup?r=6bJ0ljLFsFh_Vvej


5. After confirmation, click "Continue" and wait for resources to be allocated. The first clone takes about 2 minutes. When the status changes to "Running", click the jump arrow next to "API Address" to go to the Demo page. Please note that users must complete real-name verification before they can use the API address feature.

If the container stays in the "Allocating resources" state for more than 10 minutes, try stopping and restarting it. If restarting still does not resolve the issue, please contact platform customer service on the official website.


6. After opening the Demo, you can start the conversation immediately.


One-click deployment of Llama 3.1 405B model OpenAI compatible API service

1. If you want to deploy the OpenAI compatible API service, select "One-click deployment of Llama 3.1 405B model OpenAI compatible API service" on the Tutorial page and, as before, click "Run this tutorial online".


2. After the page jumps, click "Clone" in the upper right corner to clone the tutorial into your own container.


3. Click "Next: Select Hashrate" in the lower right corner.


4. After the page jumps, because the model is large, select "NVIDIA RTX A6000-8" for compute; the image is still "vllm". Click "Next: Review".


5. After confirmation, click "Continue" and wait for resources to be allocated. The first cloning takes about 6 minutes. When the status shows "Running", the model will automatically start loading.


6. Scroll to the bottom of the page. When the log shows the following routing information, it means the service has been started successfully. Open the API address.


7. After opening it, a 404 message is displayed by default. Appending "/v1/models" to the address will display the deployment information of the current model.
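You can also run the same check from a script. Below is a minimal sketch using Python's requests library; the API address is a placeholder, so replace it with the "API Address" shown in your own container.

```python
# Minimal sketch: list the models served by the OpenAI-compatible endpoint.
# "YOUR-CONTAINER-API-ADDRESS" is a placeholder, not a real address.
import requests

API_BASE = "https://YOUR-CONTAINER-API-ADDRESS"

resp = requests.get(f"{API_BASE}/v1/models", timeout=30)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])  # the model ID to use in chat completion requests
```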


8. Start an Open WebUI service locally and add a new connection under "External Connections": fill in the API address from the previous step, with "/v1" appended, in the "OpenAI API" field, leave the "API Key" empty, and click "Save" in the lower right corner.
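The same connection details also work outside Open WebUI: any OpenAI-compatible client can point at the deployed service. The sketch below is an assumption-laden example, not part of the tutorial itself; the base URL and model name are placeholders (check "/v1/models" for the exact model ID), and the key is a dummy value since this service is deployed without one.

```python
# Minimal sketch: call the deployed model with the official openai client.
# Base URL and model name are placeholders; no real API key is required here.
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR-CONTAINER-API-ADDRESS/v1",  # placeholder address
    api_key="not-needed",  # the tutorial's service has no API key configured
)

reply = client.chat.completions.create(
    model="Llama-3.1-405B",  # use the ID returned by /v1/models
    messages=[{"role": "user", "content": "Which is bigger, 9.9 or 9.11?"}],
)
print(reply.choices[0].message.content)
```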


9. After saving, you can see Llama-3.1-405B appearing in "Select Model". After selecting the model, you can start the conversation!


Finally, I'd like to recommend an online academic sharing event. If you're interested, scan the QR code to participate!
