
General Malignant Tumor Dataset (恶性肿瘤通用数据集)

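山东省数据知识产权存证登记平台 (Shandong Province Data Intellectual Property Evidence Storage and Registration Platform) · Updated 2025-05-29 · Indexed 2025-06-13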
Download link:
https://sddip.com/djgg/publicDetails/05ce3cf0694c4ef881f0b07ea6d309fc
Resource description:
This medical text dataset, constructed by our company, is a standardized collection built around the needs of clinical oncology diagnosis, treatment, and research. It typically covers multi-dimensional information spanning pathology, imaging, genomics, treatment plans, and prognostic follow-up, and is suited to medical research, clinical decision support, and AI model development. The dataset's structure draws on multiple authoritative sources, including relevant clinical guidelines, textbooks, and expert consensus statements. It aggregates real-world medical data from heterogeneous sources such as medical record front-page information, electronic medical records (EMR), medical examinations, and genetic data. Value domains follow international medical standards and use unified coding systems (ICD-O, LOINC, and others) to keep the data compatible and exchangeable. Extraction rules were designed by a professional medical team to ensure the reliability of the extracted content, and the data has been de-identified to protect patient privacy. In total, the dataset covers 678 data structure items and more than 100,000 real cases. Datasets of this kind can significantly improve the efficiency of oncology research and support the development of large medical models.
Provider:
北方健康医疗大数据科技有限公司
Collected summary
Dataset introduction
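Background and challenges
Background overview
This is a medical text dataset focused on malignant tumors, built by 北方健康医疗大数据科技有限公司. It contains more than 100,000 real cases and 678 data structure items, covering standardized multi-dimensional information such as pathology, imaging, and genomics. It follows international medical coding standards, has been de-identified to protect privacy, and is suited to clinical diagnosis and treatment support, medical research, AI model development, public health analysis, and other scenarios.
The content above was collected and summarized by 遇见数据集.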