Bolin97/MedicalQA
收藏Hugging Face2025-01-18 更新2025-02-15 收录
下载链接:
https://hf-mirror.com/datasets/Bolin97/MedicalQA
下载链接
链接失效反馈官方服务:
资源简介:
MedicalQA-1.4M是一个集成的规模较大、质量较高的中文SFT数据集,旨在通过SFT或RAG方法将医疗知识注入到大型语言模型中。该数据集包含四个子集:DX、HT、TCM和MB,涵盖了西医和中医两种类型的医疗知识。DX子集包含来自丁香医生的53554个高质量的问答对;HT子集包括医学网络语料库、百科全书和书籍;TCM子集涉及中医知识;MB子集包含医学书籍。
MedicalQA is an integrated large-scale, high-quality Chinese SFT dataset designed for injecting medical knowledge into LLMs through SFT or RAG. It includes four subsets: DX, HT, TCM, and MB, covering two types of medical knowledge: Western Medicine and Traditional Chinese Medicine. The DX subset contains 53,554 high-quality Q&A pairs from the DingXiang doctor website; the HT subset includes medical web corpora, encyclopedias, and books; the TCM subset is related to Traditional Chinese Medicine knowledge; and the MB subset consists of medical books.
提供机构:
Bolin97



