
synthetic-medical-conversations-deepseek-v3

ModelScope community · Updated 2025-11-15 · Indexed 2025-02-01
Download link:
https://modelscope.cn/datasets/AI-ModelScope/synthetic-medical-conversations-deepseek-v3
Description:
# 🍎 Synthetic Multipersona Doctor Patient Conversations

Author: Nisten Tahiraj

License: MIT

# 🧠 Generated by [DeepSeek V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) running in full BF16

### 🛠️ The conversations include induced errors and obfuscations by the AI patients, followed by friendly rebuttals and corrected diagnoses from the AI doctors.

This makes the dataset useful both as training data and as a source for retrieval systems aimed at reducing hallucinations and improving diagnosis quality.

> ### 🐧 Conversations were generated in the following languages
>
> English
>
> Chinese
>
> Japanese
>
> Danish
>
> German
>
> French

More languages coming :)

Follow our org, led by [Doctor @JohnsonThomasMD](https://x.com/JohnsonThomasMD), for more updates. DeepSeek R1 generations and a new mobile open-source medical model are in the works too 🚀.

### The following disease list was used as the seed for each synthetic conversation: [nisten/all-human-diseases](https://huggingface.co/datasets/nisten/all-human-diseases)

# DISCLAIMER: These are not human conversations. They were not corrected by a human at all. They are all straight from the AI.

Before the data was generated, the medical performance of the LLM was measured to be significantly higher than even Google's MedPalm 2.

Reference: MedPalm 2 scores no higher than 72%: https://paperswithcode.com/sota/multiple-choice-question-answering-mcqa-on-21

Despite the driver issues, DeepSeek V3 Instruct has stellar scores in medical benchmarking, shown here running the MultiMedQA bench in fp8_w8a8 on 8x AMD MI300X cards. Little to no difference was observed in medical benchmarking between bfloat16 and 8-bit. However, other tests showed some divergence: https://x.com/nisten/status/1874996106540503367

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6379683a81c1783a4a2ddba8/_9FbIF6xwu1WAuPLoI4Ri.jpeg)

Yes, raw DeepSeek V3 with no special prompting scores 79% vs. only 72% for the complicated CoT MedPalm 2 API setup. The newer DeepSeek R1 has not yet been tested.

Feel free to leave comments or concerns, and even to contribute more data to open science.

## Thank you https://www.vultr.com/ for sponsoring the compute.

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6379683a81c1783a4a2ddba8/6ES2lgfQav9u6mfI_aVsz.jpeg)
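
The card says a disease list seeded each conversation and that the AI patients introduce errors which the AI doctors correct, but it does not publish the generation script. The sketch below is only one way such seeding could look, not the author's method. The names `ConversationSeed`, `make_seeds`, and `build_prompt`, the prompt wording, the sample disease names, and the choice to pair every disease with every language are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Languages listed on the dataset card.
LANGUAGES = ["English", "Chinese", "Japanese", "Danish", "German", "French"]


@dataclass(frozen=True)
class ConversationSeed:
    """Hypothetical container: one disease name plus one target language."""
    disease: str
    language: str


def make_seeds(diseases, languages=LANGUAGES):
    """Pair each non-blank disease name with each language.

    Assumption: the card does not say whether every disease was
    generated in every language; the full cross product is used here
    only to keep the example simple.
    """
    return [
        ConversationSeed(name.strip(), language)
        for name in diseases
        if name.strip()
        for language in languages
    ]


def build_prompt(seed):
    """Build an illustrative generation prompt (wording is hypothetical)."""
    return (
        f"Write a doctor-patient conversation in {seed.language} "
        f"about {seed.disease}. "
        "The patient should misreport or obscure at least one detail. "
        "The doctor should respond with a friendly rebuttal and finish "
        "with a corrected diagnosis."
    )


# Placeholder names standing in for rows of the seed list.
sample_diseases = ["Asthma", "  ", "Type 2 diabetes"]
seeds = make_seeds(sample_diseases)
prompts = [build_prompt(seed) for seed in seeds]

print(len(prompts))
print(prompts[0])
```

In practice the placeholder names would be replaced by the rows of [nisten/all-human-diseases](https://huggingface.co/datasets/nisten/all-human-diseases), for example loaded with the Hugging Face `datasets` library. The column layout of that list is not described on this card, so check it before use.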

Provider:
maas
Created:
2025-01-30