HyperAI

MMedBench Multilingual Medical Proficiency Test Benchmark Dataset

Date

7 months ago

Size

20.69 MB

Organization

Shanghai Jiao Tong University

Publish URL

github.com

License

CC BY-NC-SA 3.0


MMedBench is a comprehensive multilingual medical proficiency test benchmark dataset developed in 2024 by the Smart Healthcare Team of the School of Artificial Intelligence at Shanghai Jiao Tong University. The accompanying paper is "Towards Building Multilingual Language Model for Medicine". The benchmark aims to evaluate the development of multilingual models in the medical field and covers 6 languages and 21 medical subfields. All questions in MMedBench are taken directly from medical examination question banks in various countries, which ensures the accuracy and reliability of the assessment and avoids biases in diagnostic understanding caused by differences in medical practice guidelines between countries.

The benchmark evaluates models along two main dimensions: selection accuracy and explanation rationality. During evaluation, a model must not only select the correct answer but also provide a reasonable explanation, which further tests its ability to understand and interpret complex medical information. The MMedBench data statistics give the basic numerical statistics of the training set and the test set, as well as the distribution of samples across topics.
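The selection-accuracy dimension can be illustrated with a minimal sketch. It assumes each evaluated question is a record with `language`, `answer`, and `prediction` fields; these field names, the language labels, and the sample records are hypothetical and are not taken from the MMedBench files. Explanation rationality needs a separate text-similarity or human-scoring step and is not covered here.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Return {language: fraction of records whose prediction matches the answer}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        lang = rec["language"]
        total[lang] += 1
        # Compare option labels case-insensitively, ignoring surrounding whitespace.
        if rec["prediction"].strip().upper() == rec["answer"].strip().upper():
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Made-up records for illustration only; the field names are assumptions.
sample = [
    {"language": "English", "answer": "A", "prediction": "A"},
    {"language": "English", "answer": "C", "prediction": "B"},
    {"language": "Spanish", "answer": "D", "prediction": "d "},
]

print(accuracy_by_language(sample))  # {'English': 0.5, 'Spanish': 1.0}
```

Reporting accuracy per language rather than as a single pooled number keeps weaker languages from being hidden by stronger ones, which matters for a multilingual benchmark.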

The research team evaluated mainstream medical language models on MMedBench under three test strategies: zero-shot, PEFT fine-tuning, and full-model fine-tuning. The results show that the team's proposed model surpasses existing open-source models of the same level on both key dimensions, selection accuracy and explanation rationality, and is comparable to GPT-4. The team also conducted a manual scoring evaluation, in which the proposed model was the one most preferred by human users.

The release of MMedBench not only promotes research on multilingual large models in the medical field, but also provides new tools for clinical practice, especially for overcoming language barriers and making medical resources available globally. All data and code have been open-sourced, further promoting cooperation and technology sharing in the global research community.

MMedBench data statistics. Figure a shows the basic numerical statistics of the MMedBench training set and test set; Figure b reveals the distribution of MMedBench samples on different topics.

MMedBench.torrent
Seeding: 1 | Downloading: 1 | Completed: 74 | Total downloads: 179
  • MMedBench/
    • README.md
      2.67 KB
    • README.txt
      5.33 KB
    • data/
      • MMedBench.zip
        20.69 MB
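
After downloading, it can help to inspect the archive before extracting it. The following is a minimal sketch using only Python's standard `zipfile` module. The local path `MMedBench/data/MMedBench.zip` is an assumption based on the file listing above, and nothing is assumed about the files inside the archive.

```python
import zipfile
from pathlib import Path


def list_archive(path):
    """Return (name, size_in_bytes) for every file in the zip archive at `path`."""
    with zipfile.ZipFile(path) as zf:
        return [
            (info.filename, info.file_size)
            for info in zf.infolist()
            if not info.is_dir()
        ]


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the torrent was saved.
    archive = Path("MMedBench/data/MMedBench.zip")
    if archive.exists():
        for name, size in list_archive(archive):
            print(f"{size:>12}  {name}")
    else:
        print(f"Archive not found at {archive}")
```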