Histopathology Non-Melanoma Skin Cancer Segmentation Dataset
Research Data Australia, indexed 2024-12-14
Download link:
https://researchdata.edu.au/histopathology-non-melanoma-segmentation-dataset/3369696
Description:
A collection of 290 images of non-melanoma skin cancer H&E tissue sections with hand-annotated segmentation masks.
MyLab Pathology (Salisbury, Australia) provided access to a pre-existing collection of skin cancer slides. A pathologist selected 290 slides, and specific tissue sections within them, that were representative of typical cases of non-melanoma skin cancer.
The cancer classes are Basal Cell Carcinoma (BCC, 140 images), Squamous Cell Carcinoma (SCC, 60) and Intra-Epidermal Carcinoma (IEC, 90). The set includes shave biopsies (100), punch biopsies (58) and excisional biopsies (132).
The slides were produced using xylene processing and paraffin wax, and were imaged over four months in late 2017 and early 2018. They were sourced from patients aged 34 to 96, with a median age of 70 years. Female and male proportions were 2/3 and 1/3, respectively, closely reflecting the prevalence of non-melanoma skin cancer in the Australian population (Staples et al., 2006).
The slides were imaged with a DP27 Olympus microscope camera through the 10x magnification lens, with the light condenser attached. Individual images were stitched into a high-resolution mosaic using software available at https://github.com/smthomas-sci/HistoImageStitcher. In the resulting images, 1 pixel corresponds to 0.67 μm in length. The images are stored in TIF format.
The segmentation masks were created in ImageJ, using colors to classify pixels into 12 tissue classes: Glands (GLD), Inflammation (INF), Hair Follicles (FOL), Hypodermis (HYP), Reticular Dermis (RET), Papillary Dermis (PAP), Epidermis (EPI), Keratin (KER), Background (BKG), BCC, SCC, and IEC. The color legend is available in the repository, and the masks are stored in PNG format.
The data are provided at the original resolution (1x) and at reduced resolutions (2x, 5x and 10x downsample factors).
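Because the masks encode tissue classes as colors, a first step in most pipelines is to map each pixel color back to its class. The following is a minimal sketch of that lookup, not code from the dataset authors. The class abbreviations come from the description above, but the RGB values in `HYPOTHETICAL_LEGEND` are placeholders and must be replaced with the real color legend from the repository. The function name `class_pixel_counts` is also an illustrative choice.

```python
from collections import Counter

# Class abbreviations are taken from the dataset description.
# The RGB values are PLACEHOLDERS (assumptions for illustration only);
# substitute the color legend that ships with the repository.
HYPOTHETICAL_LEGEND = {
    (255, 255, 255): "BKG",
    (0, 0, 255): "BCC",
    (255, 0, 0): "SCC",
    (0, 255, 0): "IEC",
}


def class_pixel_counts(pixels, legend):
    """Count pixels per tissue class.

    `pixels` is any iterable of (R, G, B) or (R, G, B, A) tuples.
    Colors that are missing from the legend are tallied as "UNKNOWN",
    which helps spot an incomplete or mismatched legend.
    """
    counts = Counter()
    for rgb in pixels:
        key = tuple(rgb[:3])  # ignore an alpha channel if present
        counts[legend.get(key, "UNKNOWN")] += 1
    return counts


# Tiny synthetic "mask" of five pixels, standing in for a real PNG.
demo_pixels = [
    (255, 255, 255),
    (0, 0, 255),
    (0, 0, 255),
    (255, 0, 0),
    (12, 34, 56),  # not in the placeholder legend
]
demo_counts = class_pixel_counts(demo_pixels, HYPOTHETICAL_LEGEND)
print(dict(demo_counts))
```

For a real mask, an image library such as Pillow can supply the pixel tuples after the PNG is decoded to RGB. A non-zero "UNKNOWN" count usually means the legend is incomplete or the mask was resampled with interpolation that blended colors.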
Cancer margin data are also available, consisting of (x,y) coordinates of the cancer margins for each image, in CSV format. The training, validation and testing sets are provided to support benchmarking and are the same as those used by Thomas et al. (2021).
References:
Staples, M.P., Elwood, M., Burton, R.C., Williams, J.L., Marks, R., Giles, G.G., 2006. Non-melanoma skin cancer in Australia: the 2002 national survey and trends since 1985. Med. J. Aust. 184, 6–10.
Thomas, S.M., Lefevre, J.G., Baxter, G., Hamilton, N.A., 2021. Interpretable deep learning systems for multi-class segmentation and classification of non-melanoma skin cancer. Medical Image Analysis 68, 101915.
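The margin coordinates described above need rescaling before they can be overlaid on a downsampled image, and the stated pixel size of 0.67 μm allows conversion to physical units. The sketch below illustrates both steps under explicit assumptions: that each CSV has a header row followed by `x,y` pairs, and that coordinates are expressed in pixels of the original (1x) image. Neither assumption is confirmed by the description, so check an actual file first. All function names are hypothetical.

```python
import csv
import io

# Pixel size at the original (1x) resolution, from the dataset description.
MICRONS_PER_PIXEL_1X = 0.67


def load_margin(csv_text, has_header=True):
    """Parse margin points from CSV text.

    ASSUMPTION: two columns (x, y), optionally preceded by a header row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the header row
    return [(float(row[0]), float(row[1])) for row in reader if row]


def rescale(points, downsample):
    """Map 1x pixel coordinates onto an image downsampled by `downsample`.

    ASSUMPTION: the CSV coordinates refer to the 1x image.
    """
    return [(x / downsample, y / downsample) for x, y in points]


def to_microns(points, downsample=1):
    """Convert pixel coordinates at the given downsample factor to microns."""
    scale = MICRONS_PER_PIXEL_1X * downsample
    return [(x * scale, y * scale) for x, y in points]


# Synthetic example data, not taken from the dataset.
demo_csv = "x,y\n100,200\n150,250\n"
points_1x = load_margin(demo_csv)
points_10x = rescale(points_1x, downsample=10)
points_um = to_microns(points_1x)
print(points_1x, points_10x, points_um)
```

Keeping the points in 1x coordinates and rescaling on demand means a single margin file can serve all four provided resolutions.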
Provided by:
The University of Queensland



