five

Bolin97/MedicalQA

收藏
Hugging Face2025-01-18 更新2025-02-15 收录
下载链接:
https://hf-mirror.com/datasets/Bolin97/MedicalQA
下载链接
链接失效反馈
官方服务:
资源简介:
MedicalQA-1.4M是一个集成的规模较大、质量较高的中文SFT数据集,旨在通过SFT或RAG方法将医疗知识注入到大型语言模型中。该数据集包含四个子集:DX、HT、TCM和MB,涵盖了西医和中医两种类型的医疗知识。DX子集包含来自丁香医生的53554个高质量的问答对;HT子集包括医学网络语料库、百科全书和书籍;TCM子集涉及中医知识;MB子集包含医学书籍。

MedicalQA is an integrated large-scale, high-quality Chinese SFT dataset designed for injecting medical knowledge into LLMs through SFT or RAG. It includes four subsets: DX, HT, TCM, and MB, covering two types of medical knowledge: Western Medicine and Traditional Chinese Medicine. The DX subset contains 53,554 high-quality Q&A pairs from the DingXiang doctor website; the HT subset includes medical web corpora, encyclopedias, and books; the TCM subset is related to Traditional Chinese Medicine knowledge; and the MB subset consists of medical books.
提供机构:
Bolin97
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作