PubMed Multi-Model Medical Reasoning Dataset
Source: ModelScope community. Updated: 2026-05-22. Indexed: 2025-11-03.
Download link:
https://modelscope.cn/datasets/zpeng1989/PubMed_MultiModel_Medical_Reasoning_Dataset
Resource overview:
This dataset is sourced from PubMed and contains 940 clinical cases. For each case, several reasoning/diagnostic models (deepseek-r1, o3-mini, gemini2-ft, qwq, baichuan-m1, diagnosegpt, medgemma) were each run separately in multi-round or free-turn question answering. Every model output is split into two parts: reasoning (the reasoning process and intermediate steps) and answer (the final conclusion or recommendation). The dataset is intended for model performance comparison, reasoning interpretability analysis, and clinical question-answering research. It is for scientific research and development only and must not be used as a basis for clinical diagnosis. The original source of the data is Hugging Face [Henrychur/MedRbench-Inference-Results]; this version primarily translates that content into Chinese for easier use.
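The per-case structure described above (several models, each with a reasoning part and an answer part) can be illustrated with a small sketch. The record layout, the `ModelOutput` class, the `group_by_answer` helper, and the placeholder strings are all assumptions made for illustration; the actual field names and file format of the dataset are not stated here and may differ.

```python
from __future__ import annotations

from dataclasses import dataclass

# Model names as listed in the dataset description.
MODELS = ["deepseek-r1", "o3-mini", "gemini2-ft", "qwq",
          "baichuan-m1", "diagnosegpt", "medgemma"]


@dataclass
class ModelOutput:
    """One model's output for one case (hypothetical layout)."""
    reasoning: str  # reasoning process / intermediate steps
    answer: str     # final conclusion / recommendation


def group_by_answer(case: dict[str, ModelOutput]) -> dict[str, list[str]]:
    """Group model names by their whitespace- and case-normalized answer."""
    groups: dict[str, list[str]] = {}
    for model, output in case.items():
        key = output.answer.strip().lower()
        groups.setdefault(key, []).append(model)
    return groups


# Made-up placeholder record, NOT taken from the dataset.
example_case = {
    "deepseek-r1": ModelOutput("placeholder reasoning A", "Diagnosis X"),
    "o3-mini": ModelOutput("placeholder reasoning B", "diagnosis x "),
    "qwq": ModelOutput("placeholder reasoning C", "Diagnosis Y"),
}

groups = group_by_answer(example_case)
```

Grouping by exact normalized string is only a crude agreement check; free-text clinical answers would usually need a more careful comparison.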
Provider:
maas
Created:
2025-10-30
Collected summary
Dataset introduction

Background and challenges
Background overview
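The dataset is derived from PubMed and contains 940 medical cases. Each case records the conversational outputs of multiple reasoning models (such as deepseek-r1 and o3-mini), including the reasoning process and the final answer. The data is intended for model performance comparison and medical reasoning research; it is for research and development only and should not be used for clinical diagnosis.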
The content above was collected and summarized by 遇见数据集.



