MedReason
Source: ModelScope community | Updated: 2026-01-06 | Indexed: 2025-04-26
Download link:
https://modelscope.cn/datasets/UCSC-VLAA/MedReason
Resource description:
# MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
<p align="center">
📃 <a href="https://huggingface.co/papers/2504.00993" target="_blank">Paper</a> |🤗 <a href="https://huggingface.co/UCSC-VLAA/MedReason-8B" target="_blank">MedReason-8B</a> | 📚 <a href="https://huggingface.co/datasets/UCSC-VLAA/MedReason" target="_blank">MedReason Data</a>
</p>
## ✨ Latest News
- [05/27/2025] 🎉 MedReason wins 3rd prize 🏆 in the [Hugging Face Reasoning Datasets Competition](https://x.com/bespokelabsai/status/1910068013661118874)!
## ⚡Introduction
**MedReason** is a large-scale, high-quality medical reasoning dataset designed to enable faithful and explainable medical problem-solving in large language models (LLMs).
- We use a structured medical knowledge graph (KG) to convert clinical QA pairs into logical chains of reasoning, or “thinking paths”.
- Our pipeline generates detailed reasoning for a variety of medical questions drawn from 7 medical datasets, resulting in a dataset of **32,682** question-answer pairs, each with a detailed, step-by-step explanation.
- Fine-tuned on the proposed [MedReason dataset](https://huggingface.co/datasets/UCSC-VLAA/MedReason), our best model, [MedReason-8B](https://huggingface.co/UCSC-VLAA/MedReason-8B), achieves *state-of-the-art* performance.
We open-source our chain-of-thought (CoT) dataset here.
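
To make the idea of a KG-derived “thinking path” concrete, here is a minimal toy sketch. It is not the authors' pipeline: the triples, entity names, and the helper functions `build_graph`, `find_path`, and `render_path` are all hypothetical, and breadth-first search is an assumption chosen only to illustrate linking an entity in a question to an entity in an answer.

```python
from collections import deque

# Toy knowledge graph as (head, relation, tail) triples.
# These entities and relations are illustrative only; they are NOT
# taken from the knowledge graph used to build MedReason.
TRIPLES = [
    ("polyuria", "symptom_of", "diabetes mellitus"),
    ("diabetes mellitus", "treated_by", "metformin"),
    ("diabetes mellitus", "associated_with", "hyperglycemia"),
    ("metformin", "side_effect", "lactic acidosis"),
]


def build_graph(triples):
    """Build a directed adjacency map: node -> list of (relation, neighbor)."""
    graph = {}
    for head, rel, tail in triples:
        graph.setdefault(head, []).append((rel, tail))
        graph.setdefault(tail, [])  # make sure leaf nodes are present too
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a shortest directed path.

    Returns a list of (head, relation, tail) steps, an empty list when
    start == goal, or None when either node is unknown or unreachable.
    """
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


def render_path(start, path):
    """Render a path as a readable chain of entities and relations."""
    text = start
    for _, rel, tail in path:
        text += f" -[{rel}]-> {tail}"
    return text


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    # Question entity: "polyuria"; answer entity: "metformin" (both hypothetical).
    path = find_path(graph, "polyuria", "metformin")
    print(render_path("polyuria", path))
```

In this sketch, the rendered chain plays the role of a thinking path: each hop is a fact in the graph, so an explanation built on top of it can be checked step by step against the graph.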
## 🙏🏼 Acknowledgement
We gratefully acknowledge the inspiring work of [HuatuoGPT-o1](https://github.com/FreedomIntelligence/HuatuoGPT-o1), which laid important groundwork for this research. We also thank the developers of the excellent tools [curator](https://github.com/bespokelabsai/curator/), [trl](https://github.com/huggingface/trl), and [sglang](https://github.com/sgl-project/sglang) for making this work possible.
## 📖 Citation
```
@misc{wu2025medreasonelicitingfactualmedical,
title={MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs},
author={Juncheng Wu and Wenlong Deng and Xingxuan Li and Sheng Liu and Taomian Mi and Yifan Peng and Ziyang Xu and Yi Liu and Hyunjin Cho and Chang-In Choi and Yihan Cao and Hui Ren and Xiang Li and Xiaoxiao Li and Yuyin Zhou},
year={2025},
eprint={2504.00993},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2504.00993},
}
```
Provider:
maas
Created:
2025-04-21



