PathoNet
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8116750
Description:
PathoNet is a general purpose dataset for digital pathology. It consists of 4,462,156 jpg images divided into 12 classes (tissues).
These images were extracted from the TCGA (The Cancer Genome Atlas) data portal. No manual annotations were made; the only label is the tissue type, taken from the slide metadata.
For each tissue, 400,000 images of 256x256 pixels were randomly selected and downloaded from 400 whole-slide images (WSIs). An automated cleaning process then discarded images with excessive white content and blurred images, which is why each class ends up with fewer than 400,000 images.
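The description does not say how the cleaning step was implemented. As a rough illustration only, a filter of this kind could combine a white-pixel fraction with a simple sharpness score (variance of a Laplacian). The function names and all three thresholds below are hypothetical and are not taken from the dataset authors.

```python
import numpy as np

# Hypothetical thresholds: the dataset description does not state the real ones.
WHITE_LEVEL = 220         # a pixel counts as "white" if R, G and B all exceed this
MAX_WHITE_FRACTION = 0.5  # reject tiles that are more than half white
MIN_SHARPNESS = 50.0      # reject tiles whose Laplacian variance is below this


def white_fraction(tile: np.ndarray) -> float:
    """Fraction of pixels in an (H, W, 3) uint8 tile that are near-white."""
    return float(np.mean(np.all(tile > WHITE_LEVEL, axis=-1)))


def laplacian_variance(tile: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian of the grayscale tile (low = blurry)."""
    gray = tile.astype(np.float64).mean(axis=-1)
    lap = (
        gray[:-2, 1:-1] + gray[2:, 1:-1]
        + gray[1:-1, :-2] + gray[1:-1, 2:]
        - 4.0 * gray[1:-1, 1:-1]
    )
    return float(lap.var())


def keep_tile(tile: np.ndarray) -> bool:
    """Keep a tile only if it is neither mostly white nor blurry."""
    return (
        white_fraction(tile) <= MAX_WHITE_FRACTION
        and laplacian_variance(tile) >= MIN_SHARPNESS
    )
```

In practice the thresholds would need tuning on a sample of tiles, since staining intensity and scanner settings vary between slides.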
The dataset is already divided into Train, Validation and Test partitions. The split was made at the case level: all images from a given case are in the same partition, so no case contributes images to more than one partition.
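A case-level split of this kind can be sketched as follows. This is a minimal illustration, not the authors' procedure: the `split_by_case` helper, the 80/10/10 fractions (chosen because the published counts are roughly in that ratio) and the fixed seed are all assumptions.

```python
import random


def split_by_case(image_cases, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign images to partitions so that no case spans two partitions.

    image_cases maps an image name to its case ID. The fractions apply to
    the number of cases, so image proportions are only approximate.
    """
    # Shuffle the unique case IDs reproducibly, then cut the list in three.
    cases = sorted(set(image_cases.values()))
    random.Random(seed).shuffle(cases)
    n_train = round(fractions[0] * len(cases))
    n_val = round(fractions[1] * len(cases))

    partition_of_case = {}
    for i, case in enumerate(cases):
        if i < n_train:
            partition_of_case[case] = "train"
        elif i < n_train + n_val:
            partition_of_case[case] = "validation"
        else:
            partition_of_case[case] = "test"

    # Every image inherits the partition of its case.
    return {img: partition_of_case[case] for img, case in image_cases.items()}
```

Splitting on case IDs rather than on individual images is what prevents tiles from the same patient from appearing in both training and evaluation data.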
The final numbers of images per class and partition are:
| Tissue | Train | Validation | Test |
|---|---:|---:|---:|
| Bladder | 308,677 | 38,927 | 39,166 |
| Brain | 313,890 | 39,665 | 39,613 |
| Breast | 303,949 | 37,499 | 38,602 |
| Bronchus and lung | 308,848 | 37,730 | 39,160 |
| Colon | 243,330 | 30,220 | 32,135 |
| Corpus uteri | 312,743 | 39,549 | 39,184 |
| Kidney | 311,005 | 37,950 | 39,184 |
| Liver and intrahepatic bile ducts | 314,707 | 38,689 | 39,799 |
| Prostate gland | 296,181 | 36,568 | 36,376 |
| Skin | 307,308 | 37,411 | 38,487 |
| Stomach | 295,002 | 37,559 | 36,112 |
| Thyroid gland | 258,415 | 33,667 | 33,849 |
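The per-partition totals can be checked with a few lines of Python. The sketch below simply re-enters the counts listed above as integers; the variable names are illustrative. With the figures as listed here it gives 3,574,055 training, 445,434 validation and 451,667 test images, roughly an 80/10/10 split. The resulting overall figure of 4,471,156 is 9,000 higher than the 4,462,156 stated in the description, and the listing does not indicate which number is off.

```python
# (train, validation, test) image counts per tissue, as listed in the table.
COUNTS = {
    "Bladder": (308677, 38927, 39166),
    "Brain": (313890, 39665, 39613),
    "Breast": (303949, 37499, 38602),
    "Bronchus and lung": (308848, 37730, 39160),
    "Colon": (243330, 30220, 32135),
    "Corpus uteri": (312743, 39549, 39184),
    "Kidney": (311005, 37950, 39184),
    "Liver and intrahepatic bile ducts": (314707, 38689, 39799),
    "Prostate gland": (296181, 36568, 36376),
    "Skin": (307308, 37411, 38487),
    "Stomach": (295002, 37559, 36112),
    "Thyroid gland": (258415, 33667, 33849),
}

# Sum each column across the 12 tissues.
train_total = sum(c[0] for c in COUNTS.values())
validation_total = sum(c[1] for c in COUNTS.values())
test_total = sum(c[2] for c in COUNTS.values())
grand_total = train_total + validation_total + test_total

for name, total in [("Train", train_total),
                    ("Validation", validation_total),
                    ("Test", test_total)]:
    print(f"{name}: {total:,} ({total / grand_total:.1%})")
print(f"All partitions: {grand_total:,}")
```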
For convenience, the training data has been uploaded by class.
Created:
2023-07-20



