HyperAI

MedCalc-Bench Medical Computing Dataset

Date

6 months ago

Size

16.04 MB

Organization

Publish URL

github.com


MedCalc-Bench is a dataset designed to evaluate the medical calculation capabilities of large language models (LLMs). It was jointly released in 2024 by nine institutions, including the National Library of Medicine (National Institutes of Health) and the University of Virginia. The accompanying paper, "MEDCALC-BENCH: Evaluating Large Language Models for Medical Calculations", has been accepted by NeurIPS 2024.

The dataset contains 10,055 training instances and 1,047 test instances, covering 55 different calculation tasks. Each instance includes a patient note, a question asking for a specific clinical value, the final answer value, and a step-by-step solution. MedCalc-Bench is intended to measure, and help improve, the linguistic and computational reasoning abilities of LLMs in medical settings.

The dataset's fields are row number, calculator ID, calculator name, category, output type, note ID, note type, patient note, question, relevant entities, ground-truth answer, lower bound, upper bound, and ground-truth explanation. Together they give a model the context it needs for accurate calculation and reasoning. The dataset is split into a training set, which can be used to fine-tune LLMs on medical calculation tasks, and a test set for evaluating them.
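As a rough illustration of how the answer and bound fields might be used together, the sketch below scores numeric predictions by checking whether each one falls between an instance's lower and upper bound. It rests on assumptions: the `Instance` class and its attribute names are hypothetical stand-ins for the fields listed above, the reading of the bounds as an accepted range is an interpretation, and the example values are invented for illustration rather than taken from the dataset.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Instance:
    """Hypothetical in-memory view of one record (names are assumptions)."""
    patient_note: str
    question: str
    ground_truth: float
    lower_bound: float
    upper_bound: float


def is_correct(predicted: float, inst: Instance) -> bool:
    # Assumed rule: a prediction counts as correct if it lies within the bounds.
    return inst.lower_bound <= predicted <= inst.upper_bound


def accuracy(predictions: Sequence[float], instances: Sequence[Instance]) -> float:
    # Fraction of predictions that fall inside their instance's accepted range.
    if len(predictions) != len(instances):
        raise ValueError("predictions and instances must have the same length")
    if not instances:
        return 0.0
    hits = sum(is_correct(p, i) for p, i in zip(predictions, instances))
    return hits / len(instances)


# Invented example: BMI for a 70 kg, 1.75 m patient, with made-up bounds.
example = Instance(
    patient_note="Adult patient, weight 70 kg, height 1.75 m.",
    question="What is the patient's body mass index in kg/m^2?",
    ground_truth=70 / 1.75**2,
    lower_bound=21.7,
    upper_bound=24.0,
)

score = accuracy([22.9, 30.0], [example, example])
```

Here the first prediction lies inside the invented range and the second does not, so `score` reflects one hit out of two. A real evaluation would also need to handle non-numeric output types, which this sketch leaves out.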

MedCalc-Bench.torrent
  • MedCalc-Bench/
    • README.md
      1.94 KB
    • README.txt
      3.88 KB
    • data/
      • bench.zip
        16.04 MB