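Medical and Health Knowledge Graph Dataset
Source: 山东省数据知识产权存证登记平台 (Shandong Province Data Intellectual Property Deposit and Registration Platform). Updated 2025-01-24; indexed 2025-02-08.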
Download link:
https://sddip.com/djgg/publicDetails/7032de72bd414227a2d2387c8daf26f3
Resource description:
This medical text dataset was built by our company for use in large language model (LLM) pre-training, supervised fine-tuning, and retrieval augmentation. Our medical experts created the content by combining their professional knowledge with publicly available authoritative medical references. Production was designed around common medical knowledge use cases and drew on authoritative international standards such as ICD-10 and ICD-9-CM3, along with medical textbooks and industry standards. Knowledge extraction and knowledge fusion techniques were used to distill the medical knowledge into triples, which internal specialists then reviewed. The resulting knowledge base, produced through construction, fusion, and review, covers clinical practice, scientific research, drug information, and other fields. It is intended to support medical research, intelligent diagnosis, and decision support, and to accelerate innovative applications in medicine. The dataset contains 5 million entities and 10 million relations.
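The listing does not document the file format or schema, so the following is only a minimal sketch of how (head, relation, tail) triples of the kind described above might be held in memory and looked up for retrieval augmentation. The `Triple` type, the relation names, and the three sample rows are all hypothetical illustrations and are not taken from the dataset.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str      # subject entity
    relation: str  # relation type (names here are assumed, not the dataset's)
    tail: str      # object entity


# Illustrative triples only; they are NOT rows from the dataset.
TRIPLES = [
    Triple("Type 2 diabetes mellitus", "has_icd10_code", "E11"),
    Triple("Metformin", "treats", "Type 2 diabetes mellitus"),
    Triple("Type 2 diabetes mellitus", "has_symptom", "Polyuria"),
]


def build_index(triples):
    """Group triples by head entity so lookups avoid a full scan."""
    index = defaultdict(list)
    for t in triples:
        index[t.head].append(t)
    return index


def facts_about(index, entity):
    """Render an entity's outgoing triples as plain-text lines,
    for example to prepend to a retrieval-augmented prompt."""
    return [f"{t.head} {t.relation} {t.tail}" for t in index.get(entity, [])]


index = build_index(TRIPLES)
for line in facts_about(index, "Type 2 diabetes mellitus"):
    print(line)
```

At the stated scale of 10 million relations, an in-memory dictionary like this may still be workable, but a graph database or an on-disk index would be a reasonable alternative. The sketch only indexes outgoing edges, so a second index keyed on `tail` would be needed to answer questions such as which drugs treat a given disease.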
Provider:
北方健康医疗大数据科技有限公司
AI-collected summary
Dataset introduction

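Features
The Medical and Health Knowledge Graph Dataset is a high-quality medical text dataset built by 北方健康医疗大数据科技有限公司. It contains 5 million entities and 10 million relations, covers multiple medical fields, and is suited to medical research, intelligent diagnosis, and similar scenarios.
The above content was collected and summarized by AI.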



