mistral_medical_meadow_wikidoc_instruct_dataset
收藏魔搭社区2025-12-05 更新2025-12-06 收录
下载链接:
https://modelscope.cn/datasets/Shekswess/mistral_medical_meadow_wikidoc_instruct_dataset
下载链接
链接失效反馈官方服务:
资源简介:
Dataset made for instruction supervised finetuning of Mistral LLMs based on the Medical meadow wikidoc dataset:
- Medical meadow wikidoc (https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md)
## Medical meadow wikidoc
The Medical Meadow Wikidoc dataset comprises question-answer pairs sourced from WikiDoc, an online platform where medical professionals collaboratively contribute and share contemporary medical knowledge. WikiDoc features two primary sections: the "Living Textbook" and "Patient Information". The "Living Textbook" encompasses chapters across various medical specialties, from which we extracted content. Utilizing GTP-3.5-Turbo, the paragraph headings are transformed into questions and utilized the respective paragraphs as answers. Notably, the structure of "Patient Information" is distinct; each section's subheading already serves as a question, eliminating the necessity for rephrasing.
本数据集为基于医疗园地维基文档(Medical Meadow Wikidoc)数据集构建的、用于Mistral系列大语言模型(Large Language Model,LLM)指令监督微调的数据集,其原始数据集链接为:https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md
## 医疗园地维基文档(Medical Meadow Wikidoc)
医疗园地维基文档数据集由来自WikiDoc的问答对构成。WikiDoc是一个由医疗专业人员协作贡献并分享当代医学知识的在线平台,设有两大核心板块:"活教材(Living Textbook)"与"患者信息(Patient Information)"。
我们从覆盖各类医学专科的"活教材"各章节中提取内容,使用GPT-3.5-Turbo将段落标题转换为问题,并以对应段落作为答案。值得注意的是,"患者信息"板块的结构存在差异:其各分区的子标题本身已作为问题存在,无需进行重述改写。
提供机构:
maas
创建时间:
2025-10-03



