five

mistral_medical_meadow_wikidoc_instruct_dataset

收藏
魔搭社区2025-12-05 更新2025-12-06 收录
下载链接:
https://modelscope.cn/datasets/Shekswess/mistral_medical_meadow_wikidoc_instruct_dataset
下载链接
链接失效反馈
官方服务:
资源简介:
Dataset made for instruction supervised finetuning of Mistral LLMs based on the Medical meadow wikidoc dataset: - Medical meadow wikidoc (https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md) ## Medical meadow wikidoc The Medical Meadow Wikidoc dataset comprises question-answer pairs sourced from WikiDoc, an online platform where medical professionals collaboratively contribute and share contemporary medical knowledge. WikiDoc features two primary sections: the "Living Textbook" and "Patient Information". The "Living Textbook" encompasses chapters across various medical specialties, from which we extracted content. Utilizing GTP-3.5-Turbo, the paragraph headings are transformed into questions and utilized the respective paragraphs as answers. Notably, the structure of "Patient Information" is distinct; each section's subheading already serves as a question, eliminating the necessity for rephrasing.

本数据集为基于医疗园地维基文档(Medical Meadow Wikidoc)数据集构建的、用于Mistral系列大语言模型(Large Language Model,LLM)指令监督微调的数据集,其原始数据集链接为:https://huggingface.co/datasets/medalpaca/medical_meadow_wikidoc/blob/main/README.md ## 医疗园地维基文档(Medical Meadow Wikidoc) 医疗园地维基文档数据集由来自WikiDoc的问答对构成。WikiDoc是一个由医疗专业人员协作贡献并分享当代医学知识的在线平台,设有两大核心板块:"活教材(Living Textbook)"与"患者信息(Patient Information)"。 我们从覆盖各类医学专科的"活教材"各章节中提取内容,使用GPT-3.5-Turbo将段落标题转换为问题,并以对应段落作为答案。值得注意的是,"患者信息"板块的结构存在差异:其各分区的子标题本身已作为问题存在,无需进行重述改写。
提供机构:
maas
创建时间:
2025-10-03
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作