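Training Data for Auxiliary Diagnosis Models of HER2-Type Breast Cancer in Jiangsu Province
Source: 浙江省数据知识产权登记平台 (Zhejiang Province Data Intellectual Property Registration Platform). Updated 2024-01-12; indexed 2024-05-08.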
Download link:
https://www.zjip.org.cn/home/announce/trends/27146
Resource description:
After sample-level processing and curation, this dataset is provided for training auxiliary diagnostic AI models. It is intended to help such models capture the characteristics of HER2-type breast cancer in Jiangsu Province samples, extract features, and identify patterns, with the goal of improving the models' accuracy, robustness, and generalization.
1. Data collection: Under formal cooperation agreements, anonymized clinical sample data are obtained from medical institutions. The fields cover whether a postoperative pathology result exists, postoperative HER2 status, and postoperative FISH status.
2. Data processing: The data are checked and verified to ensure that every entry is de-identified, fully anonymized, and cannot be re-identified. Entries without a pathology result are removed, abnormal entries are cleaned out, and some missing values are filled in by generative supplementation.
3. Data curation: Negative/positive labels are generated from the original data and a rule on the postoperative status of HER2-type breast cancer. A sample is labeled positive if its HER2 score is 3+, or if its HER2 score is 2+ and the postoperative FISH result shows amplification; all other samples are labeled negative.
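The filtering and labeling steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the listing does not publish a schema, so the field names `has_pathology`, `her2`, and `fish_amplified`, the function name `label_her2`, and the sample records are all hypothetical and are not taken from the dataset.

```python
from typing import Optional

def label_her2(her2_score: str, fish_amplified: Optional[bool]) -> str:
    """Apply the stated rule: positive if HER2 is 3+, or if HER2 is 2+
    and postoperative FISH shows amplification; otherwise negative."""
    score = her2_score.strip()
    if score == "3+":
        return "positive"
    if score == "2+" and fish_amplified is True:
        return "positive"
    return "negative"

# Synthetic records for illustration only (hypothetical field names).
records = [
    {"has_pathology": True, "her2": "3+", "fish_amplified": None},
    {"has_pathology": True, "her2": "2+", "fish_amplified": True},
    {"has_pathology": True, "her2": "2+", "fish_amplified": False},
    {"has_pathology": True, "her2": "1+", "fish_amplified": None},
    {"has_pathology": False, "her2": "3+", "fish_amplified": None},
]

# Processing step: drop entries that have no pathology result.
kept = [r for r in records if r["has_pathology"]]

# Curation step: attach a negative/positive label to each remaining entry.
labels = [label_her2(r["her2"], r["fish_amplified"]) for r in kept]
print(labels)
```

In this sketch a missing FISH result (`None`) on a 2+ sample falls through to negative, which follows the "all other samples" clause of the rule as written; how the actual pipeline handles such cases is not stated in the listing.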
Provider:
杭州智圆惠方科技有限公司
Created:
2023-12-06
Collected summary
Dataset introduction

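Features
The dataset contains 150 clinical records of HER2-type breast cancer from Jiangsu Province, intended for training auxiliary diagnostic AI models. The data have been anonymized and labeled negative or positive according to a specific algorithmic rule, with the aim of improving the models' diagnostic accuracy.
The content above was collected and summarized by 遇见数据集.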