
Henan Province Breast Cancer Classification Auxiliary Diagnosis Model Training Data

Zhejiang Province Data Intellectual Property Registration Platform, updated 2024-01-13, indexed 2024-05-08
Download link:
https://www.zjip.org.cn/home/announce/trends/27350
Resource description:
This dataset is processed and curated from patient samples and is provided for training auxiliary diagnostic AI models. It is intended to help such models perform breast cancer subtyping on Henan Province patient samples, extract features, discover patterns, and ultimately improve the accuracy, robustness, and generalization ability of the diagnostic model.
1. Data collection: Under formal cooperation agreements, anonymized clinical sample data are obtained from medical institutions. The fields cover whether a postoperative pathology result exists, postoperative ER (estrogen receptor), postoperative PR (progesterone receptor), postoperative HER2 status, and postoperative FISH status, together with the in-system postoperative HER2 negative/positive typing label.
2. Data processing: The data are checked and verified to ensure that every record is de-identified, fully anonymized, and cannot be re-identified. Records without a pathology result are removed, abnormal records are cleaned out, and some missing values are filled in by generative imputation.
3. Data curation: Breast cancer classification labels are generated from the original data and algorithmic rules. The rule is: if ER is positive, PR is positive, and HER2 is negative, the sample is labeled as the Luminal type; otherwise it is labeled as another type.
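The filtering and labeling steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout and the names `Record`, `label_subtype`, and `curate` are hypothetical and are not taken from the dataset, receptor statuses are assumed to be already reduced to positive/negative booleans, and the anomaly cleaning and generative imputation steps are not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Record:
    """Hypothetical record layout; field names are assumptions, not the dataset's schema."""
    has_pathology: bool   # whether a postoperative pathology result exists
    er_positive: bool     # postoperative ER status
    pr_positive: bool     # postoperative PR status
    her2_positive: bool   # postoperative HER2 status


def label_subtype(rec: Record) -> str:
    """Apply the stated rule: ER+ and PR+ and HER2- gives Luminal, anything else Other."""
    if rec.er_positive and rec.pr_positive and not rec.her2_positive:
        return "Luminal"
    return "Other"


def curate(records: List[Record]) -> List[Tuple[Record, str]]:
    """Drop records without a pathology result, then label the remaining ones."""
    kept = [r for r in records if r.has_pathology]
    return [(r, label_subtype(r)) for r in kept]


if __name__ == "__main__":
    samples = [
        Record(True, True, True, False),   # ER+, PR+, HER2-
        Record(True, True, False, False),  # PR- breaks the rule
        Record(False, True, True, False),  # no pathology result, removed
    ]
    for rec, label in curate(samples):
        print(rec, label)
```

Because the rule is a strict conjunction, any single deviation (ER negative, PR negative, or HER2 positive) places the sample in the other category.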
Provider:
杭州智圆惠方科技有限公司
Created:
2023-12-29
Collected summary
Dataset introduction
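Features
This dataset is training data for a Henan Province breast cancer classification auxiliary diagnosis model. It contains 150 records, is updated once a year, is sourced from enterprise data, and covers multiple fields related to postoperative pathology. The data have been anonymized, and breast cancer classification labels are generated by specific algorithmic rules for use in improving diagnostic model performance.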
The above content was collected and summarized by 遇见数据集.