MedChatZH Chinese Medical Conversation Instruction Dataset

Date: 2 months ago
Size: 1.31 GB
Organization: East China University of Science and Technology
Publish URL: github.com
Paper URL: 2309.01114
License: Apache 2.0

MedChatZH is a Chinese medical conversation dataset released by East China University of Science and Technology in 2023. It accompanies the paper "MedChatZH: A tuning LLM for traditional Chinese medicine consultations", which aims to improve the understanding and generation of Chinese medical consultation dialogue, especially in traditional Chinese medicine (TCM) scenarios, through continued pre-training on TCM classics followed by fine-tuning on medical instruction data.

The source material comprises more than 1,000 TCM classics and medical notes, plus more than 7 million Chinese medical instructions collected from the internet and from multiple Chinese hospitals. After screening and cleaning, 763,629 medical instructions were combined with 1,305,194 general instructions drawn from BELLE-3.5M to form the med-mix-2M dataset used for dialogue fine-tuning. The TCM classics corpus supports the continued pre-training stage, and med-mix-2M supports the instruction fine-tuning stage.
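
As a quick check on the figures quoted above, the following minimal sketch computes the overall size of the instruction mix and the share contributed by each source. The function name `mix_composition` is hypothetical and is not part of the dataset or its tooling; only the two counts come from the description.

```python
def mix_composition(medical: int, general: int) -> dict:
    """Return the total size and per-source share of a two-source instruction mix."""
    total = medical + general
    return {
        "total": total,
        "medical_share": medical / total,
        "general_share": general / total,
    }


# Counts quoted in the dataset description.
stats = mix_composition(medical=763_629, general=1_305_194)

print(f"total instructions: {stats['total']:,}")
print(f"medical share: {stats['medical_share']:.1%}")
print(f"general share: {stats['general_share']:.1%}")
```

The total comes to 2,068,823 instructions, which is consistent with the "2M" in the med-mix-2M name, with roughly 37% medical and 63% general instructions.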

MedChatZH.torrent
  • MedChatZH/
    • README.md (1.53 KB)
    • README.txt (3.05 KB)
    • data/
      • MedChatZH.zip (1.31 GB)
