Tongyi Qianwen 72B Chat Int4 Model Gradio Demo
Qwen-72B-Chat-Int4 demo
Model Introduction
Tongyi Qianwen-72B (Qwen-72B) is the 72-billion-parameter model in the Tongyi Qianwen large model series developed by Alibaba Cloud. It is a Transformer-based large language model trained on a very large pre-training corpus that spans diverse data types, including large amounts of web text, professional books, and code. Building on Qwen-72B, the research team applied alignment techniques to create Qwen-72B-Chat, an AI assistant based on the large language model. This repository hosts the Int4 quantized version of Qwen-72B-Chat.
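As a rough illustration of why the Int4 version is easier to run, the sketch below estimates weight storage only. It assumes a nominal 72e9 parameters and ignores quantization scales and zero points, activations, and the KV cache, so real memory use will be higher. The function name weight_memory_gb is hypothetical.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: a nominal 72 billion parameters; the exact count may differ.
params = 72e9

print(f"FP16 weights: {weight_memory_gb(params, 16):.0f} GB")  # FP16 weights: 144 GB
print(f"Int4 weights: {weight_memory_gb(params, 4):.0f} GB")   # Int4 weights: 36 GB
```

Under these assumptions, 4-bit weights take about a quarter of the space of 16-bit weights.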
One-click deployment
This tutorial shows how to run the Int4 quantized model of Tongyi Qianwen 72B Chat (Qwen-72B-Chat-Int4) on OpenBayes.
How to run
- After the cloned container starts, open a new terminal tab
- Run the command python web_ui.py to start the Gradio demo
- Open the link printed in the terminal
- Start chatting with the model in the browser
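For readers curious what a script like web_ui.py might do, here is a minimal sketch of a Gradio chat demo. It is not the actual web_ui.py shipped in the container. It assumes the model is loaded through Hugging Face transformers with trust_remote_code=True and that the model exposes Qwen's chat(tokenizer, query, history=...) helper. The model path, to_qwen_history, and launch_demo are illustrative names, and the path may need to point to a local directory in the container.

```python
def to_qwen_history(history):
    """Convert Gradio chat history into (query, response) tuples."""
    pairs = []
    pending_user = None
    for item in history or []:
        if isinstance(item, dict):
            # "messages" format: {"role": ..., "content": ...}
            if item["role"] == "user":
                pending_user = item["content"]
            elif item["role"] == "assistant" and pending_user is not None:
                pairs.append((pending_user, item["content"]))
                pending_user = None
        else:
            # Pair format: [user_text, bot_text]
            user_text, bot_text = item
            pairs.append((user_text, bot_text))
    return pairs


def launch_demo(model_path="Qwen/Qwen-72B-Chat-Int4"):
    # Assumption: model_path is a Hugging Face ID or a local directory.
    # Heavy imports stay local so the helper above works without them.
    import gradio as gr
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_path, device_map="auto", trust_remote_code=True
    ).eval()

    def respond(message, history):
        # Qwen's chat helper returns (response, updated_history).
        reply, _ = model.chat(
            tokenizer, message, history=to_qwen_history(history)
        )
        return reply

    # launch() prints a local URL to open in the browser.
    gr.ChatInterface(respond).launch()
```

Calling launch_demo() starts the server. The history helper is kept separate because Gradio may pass history either as pairs or as role/content dictionaries, depending on the version and settings.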