ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Yu Sun, Xingyu Qian, Weiwen Xu, Hao Zhang, Chenghao Xiao, Long Li, Yu Rong, Wenbing Huang, Qifeng Bai, Tingyang Xu
Publication date: 6/15/2025
Abstract

Though reasoning-based large language models (LLMs) have excelled in mathematics and programming, their capabilities in knowledge-intensive medical question answering remain underexplored. To address this, we introduce ReasonMed, the largest medical reasoning dataset, comprising 370k high-quality examples distilled from 1.7 million initial reasoning paths generated by various LLMs. ReasonMed is constructed through a multi-agent verification and refinement process, where we design an Error Refiner to enhance the reasoning paths by identifying and correcting error-prone steps flagged by a verifier. Leveraging ReasonMed, we systematically investigate best practices for training medical reasoning models and find that combining detailed Chain-of-Thought (CoT) reasoning with concise answer summaries yields the most effective fine-tuning strategy. Based on this strategy, we train ReasonMed-7B, which sets a new benchmark for sub-10B models, outperforming the prior best by 4.17% and even exceeding LLaMA3.1-70B on PubMedQA by 4.60%.
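
The verify-then-refine step described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `ReasoningPath` type, the `Verifier` and `Refiner` callables, and the `max_flagged` discard threshold are all hypothetical names and policies introduced here for illustration only. In practice the verifier and refiner would be LLM agents; here they are plain functions so the control flow is visible.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReasoningPath:
    """One candidate chain of reasoning for a question (hypothetical schema)."""
    question: str
    steps: List[str]
    answer: str


# Hypothetical interfaces, not taken from the paper:
# a verifier returns the indices of steps it considers error-prone,
# a refiner returns a corrected version of one flagged step.
Verifier = Callable[[ReasoningPath], List[int]]
Refiner = Callable[[ReasoningPath, int], str]


def refine_paths(
    paths: List[ReasoningPath],
    verifier: Verifier,
    refiner: Refiner,
    max_flagged: int = 2,
) -> List[ReasoningPath]:
    """Keep paths that are clean or repairable; rewrite their flagged steps.

    Assumed policy: a path with more than `max_flagged` flagged steps is
    dropped rather than repaired. The abstract does not state the real rule.
    """
    kept: List[ReasoningPath] = []
    for path in paths:
        flagged = verifier(path)
        if len(flagged) > max_flagged:
            continue  # too many suspect steps: discard this path
        steps = list(path.steps)  # copy so the input path is not mutated
        for i in flagged:
            steps[i] = refiner(path, i)
        kept.append(ReasoningPath(path.question, steps, path.answer))
    return kept
```

Passing the verifier and refiner in as callables keeps the filtering logic independent of whichever models play those roles, so the same loop could be exercised with simple rule-based stand-ins during testing.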