A Large-Scale CT and PET/CT Dataset for Lung Cancer Diagnosis
Source: www.cancerimagingarchive.net (indexed 2025-01-22)
Download link:
https://www.cancerimagingarchive.net/collection/lung-pet-ct-dx/
Description:
<p>This dataset consists of CT and PET/CT DICOM images of lung cancer subjects, with XML annotation files that indicate tumor location with bounding boxes. The images were retrospectively acquired from patients with suspected lung cancer who underwent standard-of-care lung biopsy and PET/CT. Subjects were grouped according to tissue histopathological diagnosis. Patients with names/IDs containing the letter 'A' were diagnosed with adenocarcinoma, 'B' with small cell carcinoma, 'E' with large cell carcinoma, and 'G' with squamous cell carcinoma.</p><p>The images were analyzed at the mediastinum (window width, 350 HU; level, 40 HU) and lung (window width, 1,400 HU; level, –700 HU) settings. Reconstructions were made with a 2 mm slice thickness and lung settings. The CT slice interval varies from 0.625 mm to 5 mm. Scanning modes include plain, contrast, and 3D reconstruction.</p><p>Before the examination, each patient fasted for at least 6 hours, and each patient's blood glucose was below 11 mmol/L. Whole-body emission scans were acquired 60 minutes after the intravenous injection of 18F-FDG (4.44 MBq/kg, 0.12 mCi/kg), with patients in the supine position in the PET scanner. FDG doses and uptake times were 168.72-468.79 MBq (295.8±64.8 MBq) and 27-171 min (70.4±24.9 min), respectively. The 18F-FDG supplied had a radiochemical purity of 95%. Patients were allowed to breathe normally during PET and CT acquisitions. Attenuation correction of the PET images was performed using CT data with the hybrid segmentation method, with a CT protocol of 180 mAs, 120 kV, and a pitch of 1.0. Each study comprised one CT volume, one PET volume, and fused PET and CT images: the CT resolution was 512 × 512 pixels at 1 mm × 1 mm, and the PET resolution was 200 × 200 pixels at 4.07 mm × 4.07 mm, with a slice thickness and an interslice distance of 1 mm. Both volumes were reconstructed with the same number of slices.
Three-dimensional (3D) emission and transmission scans were acquired from the base of the skull to mid-femur. The PET images were reconstructed via the TrueX TOF method with a slice thickness of 1 mm.</p><p>The location of each tumor was annotated by five academic thoracic radiologists with expertise in lung cancer, to make this dataset a useful resource for developing algorithms for medical diagnosis. Two of the radiologists had more than 15 years of experience and the others had more than 5 years of experience. After one radiologist labeled each subject, the other four performed a verification, so that all five radiologists reviewed each annotation file in the dataset. Annotations were captured using <a href="https://pypi.org/project/labelImg/">LabelImg</a>. The image annotations are saved as XML files in PASCAL VOC format, which can be parsed using the PASCAL Development Toolkit: <a href="https://pypi.org/project/pascal-voc-tools/">https://pypi.org/project/pascal-voc-tools/</a>. Python code to visualize the annotation boxes on top of the DICOM images can be <a href="/wp-content/uploads/VisualizationTools.zip" download="VisualizationTools.zip">downloaded here</a>.</p><p>Two deep learning researchers used the images and the corresponding annotation files to train several well-known detection models, which resulted in a maximum <em>a posteriori</em> probability (MAP) of around 0.87 on the validation set.</p>
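Because the annotations are PASCAL VOC XML files, the bounding boxes can also be read with Python's standard library alone. The following is a minimal sketch, not the dataset's own tooling. It assumes the usual VOC layout written by LabelImg (an `annotation` root with `object` elements, each holding a `name` and a `bndbox` with `xmin`, `ymin`, `xmax`, `ymax`). The sample XML, its file name, and the function name `parse_voc_boxes` are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the standard PASCAL VOC layout; all values are made up.
SAMPLE_XML = """<annotation>
  <filename>example.dcm</filename>
  <size><width>512</width><height>512</height><depth>1</depth></size>
  <object>
    <name>A</name>
    <bndbox>
      <xmin>120</xmin><ymin>200</ymin><xmax>180</xmax><ymax>260</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_voc_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects that carry no bounding box
        # Coordinates are pixel positions; go through float() so that
        # values written as "120.0" are also accepted.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, *coords))
    return boxes


boxes = parse_voc_boxes(SAMPLE_XML)
print(boxes)
```

For a file on disk, the same function can be called on the file's contents, for example `parse_voc_boxes(open(path, encoding="utf-8").read())`, where `path` points to one of the dataset's XML files.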
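The mediastinum and lung window settings quoted in the description (width 350 HU, level 40 HU; width 1,400 HU, level -700 HU) can be applied to CT values for display. The sketch below uses a simplified linear window: values below `level - width/2` map to 0, values above `level + width/2` map to 1, and values in between scale linearly. It assumes the pixel data have already been converted to Hounsfield units. The names `apply_window`, `window_slice`, `MEDIASTINUM`, and `LUNG` are illustrative and do not come from the dataset.

```python
# Window presets taken from the dataset description (values in HU).
MEDIASTINUM = {"width": 350, "level": 40}
LUNG = {"width": 1400, "level": -700}


def apply_window(hu, width, level):
    """Map one HU value to a display intensity in [0.0, 1.0]."""
    lower = level - width / 2.0          # bottom edge of the window
    scaled = (hu - lower) / width        # 0 at the bottom edge, 1 at the top
    return min(1.0, max(0.0, scaled))    # clip values outside the window


def window_slice(hu_rows, width, level):
    """Apply the window to a 2D slice given as a list of rows of HU values."""
    return [[apply_window(v, width, level) for v in row] for row in hu_rows]


# Tiny made-up slice: air, lung-like tissue, soft tissue, dense bone.
demo = [[-1000, -700], [40, 1000]]
print(window_slice(demo, **LUNG))
print(window_slice(demo, **MEDIASTINUM))
```

With the lung preset the window spans -1400 HU to 0 HU, and with the mediastinum preset it spans -135 HU to 215 HU, so the same slice shows different tissue contrast under each setting.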
Provider:
The Cancer Imaging Archive
Collected summary
Dataset introduction

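Background and challenges
Background overview
This dataset is a large-scale collection of CT and PET/CT images intended for lung cancer diagnosis. It contains DICOM images and XML annotation files that mark tumors with bounding boxes. The data come from patients with suspected lung cancer, are grouped by histopathological type, and were annotated by experienced radiologists. The dataset has been used to train deep learning models, reaching a MAP of about 0.87 on the validation set.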
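Since the histopathological group is encoded as a letter in each subject's name/ID, a class label can be derived from the ID string. The sketch below assumes IDs end in a group letter followed by digits, as in the hypothetical example `Lung_Dx-A0001`. That format is an assumption and should be checked against the actual IDs in the download. The letter-to-diagnosis mapping itself is taken from the dataset description.

```python
import re

# Letter-to-diagnosis mapping as stated in the dataset description.
HISTOLOGY_BY_LETTER = {
    "A": "Adenocarcinoma",
    "B": "Small Cell Carcinoma",
    "E": "Large Cell Carcinoma",
    "G": "Squamous Cell Carcinoma",
}

# Assumed ID shape: a group letter immediately followed by trailing digits.
_ID_PATTERN = re.compile(r"([ABEG])\d+$")


def histology_from_id(subject_id):
    """Return the diagnosis for a subject ID, or None if no letter matches."""
    match = _ID_PATTERN.search(subject_id)
    if match is None:
        return None
    return HISTOLOGY_BY_LETTER[match.group(1)]


print(histology_from_id("Lung_Dx-A0001"))  # hypothetical ID
```

Anchoring the pattern to the trailing digits avoids matching letters that belong to a fixed prefix of the ID.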