Date

2 years ago

Size

20.69 MB

Organization

Publish URL

github.com

Paper URL

www.nature.com

License

CC BY-NC-SA 3.0

Tags

MMedBench is a comprehensive multilingual medical proficiency benchmark dataset released in 2024 by the Smart Healthcare Team of the School of Artificial Intelligence at Shanghai Jiao Tong University. The associated paper is "Towards building multilingual language model for medicine". The benchmark is designed to evaluate multilingual models in the medical field and covers 6 languages and 21 medical subfields. All questions are taken directly from the medical examination question banks of various countries, which keeps the assessment accurate and reliable and avoids the bias in diagnostic understanding that differing national medical practice guidelines could otherwise introduce.

The benchmark has two main evaluation dimensions: selection accuracy and explanation rationality. During evaluation, a model must not only select the correct answer but also give a reasonable explanation for it, which further tests its ability to understand and interpret complex medical information. The dataset's statistics cover the basic numerical statistics of the training and test sets and the distribution of samples across topics.

The research team evaluated mainstream medical language models on MMedBench under three settings: zero-shot, PEFT fine-tuning, and full-model fine-tuning. According to the reported results, the team's proposed model surpasses existing open-source models of the same scale on both selection accuracy and explanation rationality, and is comparable to GPT-4. The team also ran a manual scoring evaluation, in which the proposed model was the one human raters preferred most.

MMedBench promotes research on multilingual large models in the medical field and provides new tools for clinical practice, especially for overcoming language barriers and making medical resources available globally. All data and code have been open-sourced to support cooperation and technology sharing in the global research community.
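As an illustration of the selection accuracy dimension described above, the following is a minimal sketch of per-language multiple-choice scoring. It is not the benchmark's official evaluation script: the record fields `language`, `answer`, and `prediction`, the helper names, and the sample records are all assumptions made for this example, and the actual MMedBench file layout may differ.

```python
from collections import defaultdict


def normalize(options):
    """Reduce an option string such as 'a', 'A' or 'C, A' to a set of uppercase letters."""
    return frozenset(ch.upper() for ch in options if ch.isalpha())


def accuracy_by_language(records):
    """Return {language: fraction of records whose prediction matches the answer}.

    Each record is assumed (hypothetically) to be a dict with the keys
    'language', 'answer' and 'prediction'.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        lang = rec["language"]
        total[lang] += 1
        # Comparing sets ignores case, separators and option order.
        if normalize(rec["prediction"]) == normalize(rec["answer"]):
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Made-up records for illustration only; they are not taken from the dataset.
sample = [
    {"language": "English", "answer": "B", "prediction": "b"},
    {"language": "English", "answer": "A,C", "prediction": "A"},
    {"language": "Spanish", "answer": "A,C", "prediction": "C, A"},
]
scores = accuracy_by_language(sample)
```

Reporting one score per language, rather than a single pooled number, keeps a strong result in one language from hiding a weak one in another. Scoring explanation rationality would need a separate text-similarity or human-rating step that this sketch does not cover.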

MMedBench data statistics. Figure a shows the basic numerical statistics of the MMedBench training set and test set; Figure b reveals the distribution of MMedBench samples on different topics.

MMedBench.torrent

Seeding: 3, Downloading: 0, Completed: 177, Total Downloads: 435

MMedBench/
- README.md
  2.67 KB
- README.txt
  5.33 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

Related Datasets

CL-bench Context Learning Evaluation Benchmark Dataset

4 months ago

LightOnOCR-mix-0126 Text Transcription Dataset

5 months ago

Patient Churn Prediction Dataset

5 months ago

LongBench-Pro Long Context Comprehensive Evaluation Dataset

6 months ago

MMedBench Multilingual Medical Proficiency Test Benchmark Dataset
