
2 million histological images of breast cancer tumors with HER2 labels

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/8383579
Description:
Data Description

This dataset contains about 2 million non-overlapping image patches from hematoxylin and eosin (H&E) stained histological images of human breast cancer tumor tissue. The anonymized data come from a cohort of breast cancer (BC) patients at the A. C. Camargo Cancer Center (ACCCC, N = 504), all of whom were treated for breast cancer at the ACCCC between 2019 and 2021. As part of diagnosis, in HER2 IHC score 2+ cases, the patient's HER2 status was determined following the ASCO guidelines updated in 2018, with visual evaluation of the IHC assay and either a FISH or DDISH test. All cases with metastasis or neoadjuvant treatment were excluded. Ethical approval of the ACCCC study was given by the ethics committee of the Fundação Antônio Prudente.

A total of 426 high-resolution H&E stained images (40x magnification) were scanned from biopsy and resection tissue samples with a Leica Aperio AT2 scanner. The slides were divided into 256 px x 256 px tiles at a resolution of 0.5 µm/pixel. A custom-trained ConvNext-tiny neural network was then used to keep only tiles from the tumor region and its environment, yielding a total of 2,051,877 image patches.

The cases were divided into three groups according to the results of the IHC and ISH tests: HER2-negative, HER2-low and HER2-high. A sample is considered HER2-negative with an IHC score of 0; HER2-low with an IHC score of 1+, or an IHC score of 2+ with a negative ISH-based test result; and HER2-high with an IHC score of 2+ with a positive ISH-based test result, or an IHC score of 3+.

The accompanying code used for training the models is available at https://github.com/tojallab/wsi-mil
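
The grouping rule and the tiling step described above can be sketched in a few lines of Python. This is only an illustration and is not taken from the authors' repository. The function names `her2_group` and `tile_origins` are hypothetical, as is the encoding of IHC scores as strings. Dropping partial tiles at the slide border is also an assumption, since the description does not say how edges were handled.

```python
from typing import Iterator, Optional, Tuple

TILE_SIZE = 256  # tile edge in pixels, as stated in the description


def her2_group(ihc_score: str, ish_positive: Optional[bool] = None) -> str:
    """Map an IHC score (plus the ISH result for 2+ cases) to a HER2 group."""
    if ihc_score == "0":
        return "HER2-negative"
    if ihc_score == "1+":
        return "HER2-low"
    if ihc_score == "2+":
        # Equivocal cases are resolved by the ISH-based test (FISH or DDISH).
        if ish_positive is None:
            raise ValueError("IHC 2+ requires an ISH-based test result")
        return "HER2-high" if ish_positive else "HER2-low"
    if ihc_score == "3+":
        return "HER2-high"
    raise ValueError(f"unknown IHC score: {ihc_score!r}")


def tile_origins(
    width: int, height: int, tile: int = TILE_SIZE
) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) corners of non-overlapping tiles.

    Assumption: tiles that would extend past the slide border are skipped.
    """
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            yield (x, y)
```

For example, `her2_group("2+", ish_positive=False)` falls into the HER2-low group, and a 512 x 300 pixel region yields two tile origins in a single row. Tumor-region filtering with the ConvNext-tiny model is not covered here, since it depends on the trained weights.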
Created:
2024-08-20