
DICOM converted Slide Microscopy images for the TCGA-LUSC collection

Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/12690008
Description:
This dataset corresponds to a collection of images and/or image-derived data available from the National Cancer Institute Imaging Data Commons (IDC) [1]. The dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images in the IDC Portal under the TCGA-LUSC collection. You can use the manifests included in this Zenodo record to download the content of the collection by following the download instructions below.

Collection description

The Cancer Genome Atlas-Lung Squamous Cell Carcinoma (TCGA-LUSC) data collection is part of a larger effort to enhance the TCGA (http://cancergenome.nih.gov/) data set with characterized radiological images. The Cancer Imaging Program (CIP), with the cooperation of several of the TCGA tissue-contributing institutions, is working to archive a large portion of the radiological images of the LUSC cases. Please see the TCGA-LUSC page to learn more about the images and to obtain any supporting metadata for this collection.

Files included

A manifest file's name indicates the IDC data release in which a version of the collection data was first introduced. For example, collection_id-idc_v8-aws.s5cmd corresponds to the contents of the collection_id collection introduced in IDC data release v8. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

tcga_lusc-idc_v8-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets
tcga_lusc-idc_v8-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets
tcga_lusc-idc_v8-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

Note that manifest files ending in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while those ending in -gcs.s5cmd reference files in Google Cloud Storage (GCS). The actual files are identical and are mirrored between AWS and GCP.

Download instructions

Each manifest includes instructions in its header on how to download the files it references.

To download the files using the .s5cmd manifests:
1. Install the idc-index package: pip install --upgrade idc-index
2. Download the files referenced by a manifest included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd

To download the files using the .dcf manifest, see the manifest header.

Acknowledgments

The Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

References

[1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180
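As an illustration of the manifest naming convention and the download step described under "Files included" and "Download instructions", the following Python sketch splits a manifest file name into its parts and builds the corresponding idc command line. The regular expression is inferred only from the three file names listed in this record, and the helper names (ManifestName, parse_manifest_name, download_command) are hypothetical and not part of idc-index. The sketch only constructs the command; it does not run it.

```python
import re
from typing import List, NamedTuple

# Assumption: pattern inferred from the names listed in this record,
# e.g. "tcga_lusc-idc_v8-aws.s5cmd"; other records may use other names.
_MANIFEST_RE = re.compile(
    r"^(?P<collection>.+)-idc_v(?P<release>\d+)"
    r"-(?P<target>aws|gcs|dcf)\.(?P<ext>s5cmd|dcf)$"
)


class ManifestName(NamedTuple):
    """Parts of a manifest file name (hypothetical helper type)."""
    collection_id: str  # e.g. "tcga_lusc"
    idc_release: int    # IDC data release that introduced this version
    target: str         # "aws", "gcs" or "dcf"
    extension: str      # "s5cmd" or "dcf"


def parse_manifest_name(filename: str) -> ManifestName:
    """Split a manifest file name into collection, release and target."""
    match = _MANIFEST_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized manifest name: {filename!r}")
    return ManifestName(
        match["collection"],
        int(match["release"]),
        match["target"],
        match["ext"],
    )


def download_command(filename: str) -> List[str]:
    """Build the 'idc download <manifest>' command for an .s5cmd manifest."""
    info = parse_manifest_name(filename)
    if info.extension != "s5cmd":
        # The record points to the manifest header for .dcf manifests.
        raise ValueError("for .dcf manifests, see the manifest header")
    return ["idc", "download", filename]


if __name__ == "__main__":
    name = "tcga_lusc-idc_v8-aws.s5cmd"
    print(parse_manifest_name(name))
    print(" ".join(download_command(name)))
```

The returned list can be passed to subprocess.run once idc-index is installed; keeping command construction separate from execution makes the parsing easy to check without network access.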

Created: 2024-08-20