five

llama2_medical_meadow_wikidoc_instruct_dataset

收藏
魔搭社区2025-11-27 更新2025-11-29 收录
下载链接:
https://modelscope.cn/datasets/Shekswess/llama2_medical_meadow_wikidoc_instruct_dataset
下载链接
链接失效反馈
官方服务:
资源简介:
Dataset made for instruction supervised finetuning of Llama 2 LLMs based on the Medical meadow wikidoc dataset: - Medical meadow wikidoc (https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md) ## Medical meadow wikidoc The Medical Meadow Wikidoc dataset comprises question-answer pairs sourced from WikiDoc, an online platform where medical professionals collaboratively contribute and share contemporary medical knowledge. WikiDoc features two primary sections: the "Living Textbook" and "Patient Information". The "Living Textbook" encompasses chapters across various medical specialties, from which we extracted content. Utilizing GTP-3.5-Turbo, the paragraph headings are transformed into questions and utilized the respective paragraphs as answers. Notably, the structure of "Patient Information" is distinct; each section's subheading already serves as a question, eliminating the necessity for rephrasing.

本数据集基于医学草地维基文档数据集(Medical Meadow Wikidoc)构建,用于Llama 2大语言模型(LLMs)的指令监督微调: - 医学草地维基文档数据集(Medical Meadow Wikidoc):https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md ## 医学草地维基文档数据集(Medical Meadow Wikidoc) 该数据集的样本均为问答对,数据源自WikiDoc——一个由医学专业人员协作贡献并分享现代医学知识的在线平台。WikiDoc包含两大核心板块:「活态教科书」(Living Textbook)与「患者信息」(Patient Information)。其中「活态教科书」板块涵盖了各医学专科的章节内容,我们从中提取了所需数据;我们通过GTP-3.5-Turbo将段落标题转换为问题,并以对应段落作为答案。值得注意的是,「患者信息」板块的结构存在差异:其每个小节的子标题本身即为问题,无需额外进行句式改写。
提供机构:
maas
创建时间:
2025-10-03
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作