Kumar
Source: OpenDataLab | Updated: 2026-05-17 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/Kumar
Resource overview:
The Kumar dataset consists of 30 image patches of 1,000×1,000 pixels from 7 organs (6 breast, 6 liver, 6 kidney, 6 prostate, 2 bladder, 2 colon, and 2 stomach), sourced from The Cancer Genome Atlas (TCGA) database and captured at 40× magnification. The boundary of every cell nucleus is fully annotated in each image. This challenge dataset was obtained by carefully annotating tissue images of several patients with tumors of different organs who were diagnosed at multiple hospitals. It was created by downloading H&E-stained tissue images captured at 40× magnification from the TCGA archive. H&E staining is a routine protocol that enhances the contrast of tissue sections and is commonly used for tumor assessment (grading, staging, etc.). Given the diversity of nuclear appearance across organs and patients, and the variety of staining protocols used by different hospitals, the training dataset is intended to enable the development of robust and generalizable nuclear segmentation techniques that work out of the box.
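The composition and annotation style described above can be illustrated with a minimal Python sketch. It checks that the per-organ counts add up to the stated 30 patches and shows one way boundary annotations could be turned into an instance mask. The listing does not specify the annotation file format, so the polygon representation (a list of (x, y) vertices per nucleus) and the helper names `rasterize_polygon` and `build_instance_mask` are assumptions made for illustration only, not part of the dataset's tooling.

```python
import numpy as np

# Organ composition as stated in the dataset description.
ORGAN_COUNTS = {
    "breast": 6, "liver": 6, "kidney": 6, "prostate": 6,
    "bladder": 2, "colon": 2, "stomach": 2,
}
PATCH_SIZE = 1000  # pixels per side, as stated in the description


def rasterize_polygon(vertices, height, width):
    """Boolean mask of pixels whose centers fall inside the polygon.

    Hypothetical helper: uses the even-odd (ray casting) rule and
    assumes `vertices` is a list of (x, y) pairs in pixel coordinates.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5  # pixel centers
    inside = np.zeros((height, width), dtype=bool)
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if y0 == y1:
            continue  # a horizontal edge never crosses a horizontal ray
        # Does the edge straddle the horizontal line through the pixel center?
        crosses = (y0 > py) != (y1 > py)
        # x coordinate where the edge meets that horizontal line
        x_at_y = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
        # Toggle pixels lying to the left of the crossing point
        inside ^= crosses & (px < x_at_y)
    return inside


def build_instance_mask(polygons, height=PATCH_SIZE, width=PATCH_SIZE):
    """Label each nucleus polygon with a distinct integer (0 = background)."""
    mask = np.zeros((height, width), dtype=np.int32)
    for label, polygon in enumerate(polygons, start=1):
        mask[rasterize_polygon(polygon, height, width)] = label
    return mask


if __name__ == "__main__":
    # The per-organ counts should add up to the 30 patches in the description.
    assert sum(ORGAN_COUNTS.values()) == 30
    # Toy example on a small canvas with two made-up square "nuclei".
    toy = build_instance_mask(
        [[(2, 2), (6, 2), (6, 6), (2, 6)], [(7, 7), (9, 7), (9, 9), (7, 9)]],
        height=10, width=10,
    )
    print(np.unique(toy))
```

This sketch scans the whole canvas for every polygon, which is simple but slow for a full 1,000×1,000 patch with many nuclei. Restricting each scan to the polygon's bounding box would be a natural refinement.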
Provider:
OpenDataLab
Created:
2022-04-29
Collected summary
Dataset introduction

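Background and challenges
Background overview
The Kumar dataset is a medical imaging dataset for multi-tissue nucleus segmentation. It contains 30 image patches of 1,000×1,000 pixels from the TCGA database, covering 7 different organs, with the full boundary of every nucleus annotated. By drawing on diverse patients and hospitals, it aims to support the development of robust, generalizable nucleus segmentation techniques for medical research such as cancer assessment.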
The content above was collected and summarized by 遇见数据集.



