
Anhui Province Breast Cancer Classification Auxiliary Diagnosis Model Training Data

Source: Zhejiang Province Data Intellectual Property Registration Platform. Updated 2024-01-11; indexed 2024-05-08.
Download link:
https://www.zjip.org.cn/home/announce/trends/26862
Resource description:
This dataset is produced by processing and curating sample data, and is provided for training auxiliary diagnostic artificial intelligence (AI) models. It aims to help such models perform breast cancer subtyping on patient samples from Anhui Province, extract features, and discover patterns, and ultimately to improve the accuracy, robustness, and generalization ability of the diagnostic models.

1. Data collection: Anonymized clinical sample data was obtained from medical institutions under formal cooperation agreements. The fields include whether a postoperative pathology result exists, postoperative estrogen receptor (ER) status, postoperative progesterone receptor (PR) status, postoperative human epidermal growth factor receptor 2 (HER2) status, and postoperative fluorescence in situ hybridization (FISH) results, together with the in-system postoperative HER2 positive/negative subtype label.

2. Data processing: All data is inspected and verified to ensure that it is fully de-identified, completely anonymized, and cannot be re-identified. Records without a pathology result are removed, abnormal records are cleaned out, and partially missing data is filled in by generative imputation.

3. Data curation: Breast cancer classification labels are generated from the original data and predefined algorithmic rules. The rule is: if ER is positive, PR is positive, and HER2 is negative, the sample is labeled as the Luminal subtype; otherwise it is labeled as another subtype.
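The filtering and labeling steps described above can be sketched as follows. This is a minimal illustration, not the provider's implementation: the `Record` type, its field names, and the `"Luminal"`/`"Other"` label strings are hypothetical, and treating a missing receptor value as "rule not satisfied" is an assumption made here (the listing says missing values are handled by generative imputation, which this sketch does not attempt).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    """Hypothetical record layout; field names are not from the listing."""
    has_pathology: bool            # whether a postoperative pathology result exists
    er_positive: Optional[bool]    # postoperative ER status (None = missing)
    pr_positive: Optional[bool]    # postoperative PR status (None = missing)
    her2_positive: Optional[bool]  # postoperative HER2 status (None = missing)


def label_subtype(rec: Record) -> str:
    """Apply the stated rule: ER+ and PR+ and HER2- gives Luminal, else other."""
    # Assumption: a missing (None) value does not satisfy the rule.
    if (rec.er_positive is True
            and rec.pr_positive is True
            and rec.her2_positive is False):
        return "Luminal"
    return "Other"


def curate(records: List[Record]) -> List[Tuple[Record, str]]:
    """Drop records without a pathology result, then label the rest."""
    kept = [r for r in records if r.has_pathology]
    return [(r, label_subtype(r)) for r in kept]


if __name__ == "__main__":
    samples = [
        Record(True, True, True, False),   # ER+, PR+, HER2-
        Record(True, True, False, False),  # PR negative
        Record(False, True, True, False),  # no pathology result, dropped
    ]
    for rec, label in curate(samples):
        print(label)
```

Keeping the rule in a single small function makes it easy to audit against the written criterion, which matters because every label in the dataset derives from it.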
Provider:
杭州智圆惠方科技有限公司 (Hangzhou Zhiyuan Huifang Technology Co., Ltd.)
Created:
2023-12-07
Collected summary
Dataset introduction
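Features
The dataset contains 150 clinical records of breast cancer patients in Anhui Province. It is intended for training auxiliary diagnostic AI models, with the goal of improving their accuracy, robustness, and generalization ability. The data is updated annually and covers key indicators such as postoperative pathology results, ER, PR, and HER2; breast cancer classification labels are generated by predefined algorithmic rules.
The summary above was collected and generated by 遇见数据集.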