llama2_medical_meadow_wikidoc_instruct_dataset
收藏魔搭社区2025-11-27 更新2025-11-29 收录
下载链接:
https://modelscope.cn/datasets/Shekswess/llama2_medical_meadow_wikidoc_instruct_dataset
下载链接
链接失效反馈官方服务:
资源简介:
Dataset made for instruction supervised finetuning of Llama 2 LLMs based on the Medical meadow wikidoc dataset:
- Medical meadow wikidoc (https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md)
## Medical meadow wikidoc
The Medical Meadow Wikidoc dataset comprises question-answer pairs sourced from WikiDoc, an online platform where medical professionals collaboratively contribute and share contemporary medical knowledge. WikiDoc features two primary sections: the "Living Textbook" and "Patient Information". The "Living Textbook" encompasses chapters across various medical specialties, from which we extracted content. Utilizing GTP-3.5-Turbo, the paragraph headings are transformed into questions and utilized the respective paragraphs as answers. Notably, the structure of "Patient Information" is distinct; each section's subheading already serves as a question, eliminating the necessity for rephrasing.
本数据集基于医学草地维基文档数据集(Medical Meadow Wikidoc)构建,用于Llama 2大语言模型(LLMs)的指令监督微调:
- 医学草地维基文档数据集(Medical Meadow Wikidoc):https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md
## 医学草地维基文档数据集(Medical Meadow Wikidoc)
该数据集的样本均为问答对,数据源自WikiDoc——一个由医学专业人员协作贡献并分享现代医学知识的在线平台。WikiDoc包含两大核心板块:「活态教科书」(Living Textbook)与「患者信息」(Patient Information)。其中「活态教科书」板块涵盖了各医学专科的章节内容,我们从中提取了所需数据;我们通过GTP-3.5-Turbo将段落标题转换为问题,并以对应段落作为答案。值得注意的是,「患者信息」板块的结构存在差异:其每个小节的子标题本身即为问题,无需额外进行句式改写。
提供机构:
maas
创建时间:
2025-10-03



