
Metadata and data files supporting the related article: Application of convolutional neural networks to breast biopsies to delineate tissue correlates of mammographic breast density

Source: DataCite Commons. Updated: 2020-08-26. Indexed: 2024-07-27.
Download link:
https://springernature.figshare.com/articles/Metadata_and_data_files_supporting_the_related_article_Application_of_convolutional_neural_networks_to_breast_biopsies_to_delineate_tissue_correlates_of_mammographic_breast_density/9786152
Resource description:
Breast density is a radiologic feature that reflects fibroglandular tissue content relative to breast area or volume, and it is a breast cancer risk factor. This study employed deep learning approaches to identify histologic correlates in radiologically guided biopsies that may underlie breast density and distinguish cancer among women with elevated and low density.

Data access: Datasets supporting figure 2, tables 2 and 3 and supplementary table 2 of the published article are publicly available in the figshare repository, as part of this data record (https://doi.org/10.6084/m9.figshare.9786152). These datasets are contained in the zip file NPJ FigShare.zip. Datasets supporting figure 3, table 1 and supplementary table 1 of the published article are not publicly available to protect patient privacy, but can be made available on request from Dr. Gretchen L. Gierach, Senior Investigator, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA, email address: gierachg@mail.nih.gov.

Study description and aims: The study aimed to identify tissue correlates of breast density that may be important for distinguishing malignant from benign biopsy diagnoses separately among women with high and low breast density, to help inform cancer risk stratification among women undergoing a biopsy following an abnormal mammogram. Haematoxylin and eosin (H&E)-stained digitized images from image-guided breast biopsies (n=852 patients) were evaluated. Breast density was assessed as global and localized fibroglandular volume (%). A convolutional neural network characterized H&E composition. Thirty-seven features were extracted from the network output, describing tissue quantities and morphological structure. A random forest regression model was trained to identify correlates most predictive of fibroglandular volume (n=588).
Correlations between predicted and radiologically quantified fibroglandular volume were assessed in 264 independent patients. A second random forest model, a classifier, was trained to predict diagnosis (invasive vs. benign); performance was assessed using the area under the receiver operating characteristic curve (AUC). For more details on the methodology, please see the published article.

Study approval: The Institutional Review Boards at the NCI and the University of Vermont approved the protocol for this project for either active consenting or a waiver of consent to enrol participants, link data and perform analytical studies.

Dataset descriptions:

Data supporting figure 2: Datasets Figure 2A H&E.jpg, Figure 2A Mammogram.jpg, Figure 2B H&E.jpg and Figure 2B Mammogram.jpg are in .jpg file format and consist of histological whole slide H&E images and corresponding full-field digital mammograms from patients whose biopsies yielded diagnoses of atypical ductal hyperplasia and invasive carcinoma.

Data supporting figure 3: Dataset Figure 3.xls is in .xls file format and contains raw data used to generate the receiver operating characteristic (ROC) curves for the prediction of invasive cancer among women with high percent global fibroglandular volume, low percent global fibroglandular volume, high percent localized fibroglandular volume and low percent localized fibroglandular volume.

Data supporting table 1: Dataset Table1_analysis.sas7bdat is in SAS file format and contains the characteristics of study participants in the BREAST Stamp Project, who were referred for an image-guided breast biopsy, stratified by the training and testing sets (n = 852).

Data supporting table 2: Datasets Global FGV.xls (accompanying Global FGV.png file) and Localized FGV.xls (accompanying Localized FGV.png file) are in .xls file format and the accompanying files are in .png file format. The data contain histologic features identified in the random forest model for the prediction of global and localized % fibroglandular volume.

Data supporting table 3: Datasets HighGlobal_feature_importance.xls, HighLocal_feature_importance.xls, LowGlobal_feature_importance.xls and LowLocal_feature_importance.xls are in .xls file format. The accompanying figures generated from the data in the .xls files (HighGlobal_feature_importance.pdf, HighLocal_feature_importance.pdf, LowGlobal_feature_importance.pdf and LowLocal_feature_importance.pdf) are in .pdf file format. These files contain histologic features identified in the random forest model for the prediction of invasive cancer status among women with high vs. low % fibroglandular volume.

Data supporting supplementary table 1: Datasets testfeatures.xls and trainfeatures.xls are in .xls file format and include the distribution and description of the 37 histologic features extracted from the convolutional neural network deep learning output in the H&E-stained whole slide images from the training and testing sets.

Data supporting supplementary table 2: Datasets All_samples_global.xls, All_samples_local.xls, PostMeno_global.xls, PostMeno_local.xls, PreMeno_global.xls and PreMeno_local.xls are in .xls file format. The accompanying figures generated from the data in the .xls files (All_samples_global.png, All_samples_local.png, PostMeno_global.png, PostMeno_local.png, PreMeno_global.png and PreMeno_local.png) are in .png file format. These data include the histologic features identified in the random forest model that included BMI for the prediction of global and localized % fibroglandular volume.

Software needed to access the data: Data files in SAS file format require the SAS software to be accessed.
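
The study description mentions two evaluation steps: correlating predicted with radiologically quantified fibroglandular volume, and scoring the invasive-vs-benign classifier by AUC. The sketch below illustrates both on toy numbers, using only the Python standard library. It is not the authors' code. The record says only "correlations", so the choice of Pearson's r is an assumption, and the function names `pearson_r` and `roc_auc` and all values in the demo are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences.

    Assumption: the record does not say which correlation coefficient
    was used; Pearson's r is shown only as one common choice.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    (label 1, e.g. invasive) receives a higher score than a randomly
    chosen negative case (label 0, e.g. benign); ties count as 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy values, not taken from the dataset.
    predicted_fgv = [10.0, 20.0, 30.0, 40.0]
    measured_fgv = [12.0, 18.0, 33.0, 41.0]
    print("r =", pearson_r(predicted_fgv, measured_fgv))

    is_invasive = [0, 0, 1, 1]
    model_score = [0.1, 0.4, 0.35, 0.8]
    print("AUC =", roc_auc(is_invasive, model_score))
```

The pairwise AUC is quadratic in the number of cases, which is acceptable at the scale described here (264 test patients). For larger samples, a rank-based formulation or a library routine would be the more usual choice.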
Provider:
figshare
Created:
2019-09-11