Tongyi Qianwen 72B Chat Int4 Model Gradio Demo

Qwen-72B-Chat-Int4 demo

Model Introduction

Tongyi Qianwen-72B (Qwen-72B) is the 72-billion-parameter model in the Tongyi Qianwen large model series developed by Alibaba Cloud. Qwen-72B is a Transformer-based large language model trained on ultra-large-scale pre-training data. The pre-training data is diverse and wide-ranging, including a large volume of web text, professional books, and code. Building on Qwen-72B, the research team used alignment techniques to create Qwen-72B-Chat, an AI assistant based on the large language model. This repository hosts the Int4 quantized version of Qwen-72B-Chat.

One-click deployment

This tutorial shows how to run the Int4 quantized model of Tongyi Qianwen 72B Chat on OpenBayes.

How to run

  1. After the cloned container starts, open a new terminal tab
  2. Run the command python web_ui.py to start the Gradio demo
  3. Follow the prompt in the terminal output and open the link it shows
  4. Start chatting with the model
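
For readers curious about what a script like web_ui.py might do, the following is a minimal sketch, not the tutorial's actual file. It assumes the model is loaded through Hugging Face Transformers under the ID Qwen/Qwen-72B-Chat-Int4 (the container may use a local path instead), that the Qwen remote code exposes chat(tokenizer, query, history=...) as described on the model card, and that Gradio's ChatInterface is used for the UI. The helper to_qwen_history and the function main are hypothetical names introduced here for illustration.

```python
# Hypothetical sketch of a Gradio chat demo for Qwen-72B-Chat-Int4.
# MODEL_ID is an assumption; the tutorial container may load from a local path.
MODEL_ID = "Qwen/Qwen-72B-Chat-Int4"


def to_qwen_history(history):
    """Convert Gradio chat history into the (query, response) pairs
    that Qwen's chat() method takes as its history argument.

    Handles both Gradio history layouts: a list of [user, bot] pairs
    and a list of {"role": ..., "content": ...} message dicts.
    """
    pairs = []
    pending_user = None
    for turn in history or []:
        if isinstance(turn, dict):
            # "messages" layout: pair each user message with the next assistant reply
            if turn.get("role") == "user":
                pending_user = turn.get("content")
            elif turn.get("role") == "assistant" and pending_user is not None:
                pairs.append((pending_user, turn.get("content")))
                pending_user = None
        else:
            # pair layout: skip turns that are not complete yet
            user, bot = turn
            if user is not None and bot is not None:
                pairs.append((user, bot))
    return pairs


def main():
    # Heavy imports are kept inside main() so the helper above can be
    # imported and tested without a GPU or the model weights.
    import gradio as gr
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", trust_remote_code=True
    ).eval()

    def respond(message, history):
        # chat() comes from Qwen's remote code and returns (response, updated_history)
        response, _ = model.chat(tokenizer, message, history=to_qwen_history(history))
        return response

    # launch() prints the local URL to open in the browser
    gr.ChatInterface(respond).launch()
```

Calling main() at the end of the script would start the server and print the link mentioned in step 3. Keeping the history conversion in a separate pure function is a design choice made here so that it can be checked without loading the 72B model.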