2DeteCT - A large 2D expandable, trainable, experimental Computed Tomography dataset for machine learning: Slices 3,001-4,000 (reference reconstructions and segmentations)
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8017617
Description:
This upload contains the reference reconstructions and segmentations of slices 3,001-4,000 from the data collection described in:
Maximilian B. Kiss, Sophia B. Coban, K. Joost Batenburg, Tristan van Leeuwen, and Felix Lucka, "2DeteCT - A large 2D expandable, trainable, experimental Computed Tomography dataset for machine learning", Sci Data 10, 576 (2023) or arXiv:2306.05907 (2023)
Abstract:
"Recent research in computational imaging largely focuses on developing machine learning (ML) techniques for image reconstruction, which requires large-scale training datasets consisting of measurement data and ground-truth images. However, suitable experimental datasets for X-ray Computed Tomography (CT) are scarce, and methods are often developed and evaluated only on simulated data. We fill this gap by providing the community with a versatile, open 2D fan-beam CT dataset suitable for developing ML techniques for a range of image reconstruction tasks. To acquire it, we designed a sophisticated, semi-automatic scan procedure that utilizes a highly-flexible laboratory X-ray CT setup. A diverse mix of samples with high natural variability in shape and density was scanned slice-by-slice (5000 slices in total) with high angular and spatial resolution and three different beam characteristics: A high-fidelity, a low-dose and a beam-hardening-inflicted mode. In addition, 750 out-of-distribution slices were scanned with sample and beam variations to accommodate robustness and segmentation tasks. We provide raw projection data, reference reconstructions and segmentations based on an open-source data processing pipeline."
The data collection was acquired using a highly flexible, programmable, custom-built X-ray CT scanner, the FleX-ray scanner, developed by TESCAN-XRE NV and located in the FleX-ray Lab at the Centrum Wiskunde & Informatica (CWI) in Amsterdam, Netherlands. It consists of a cone-beam microfocus X-ray point source (limited to 90 kV and 90 W) that projects polychromatic X-rays onto a 14-bit CMOS (complementary metal-oxide semiconductor) flat panel detector with a CsI(Tl) scintillator (Dexela 1512NDT) and 1536-by-1944 pixels, each 74.8 μm × 74.8 μm. To create a 2D dataset, a fan-beam geometry was mimicked by reading out only the central row of the detector. Between the source and the detector is a rotation stage on which samples can be mounted. The machine components (i.e., the source, the detector panel, and the rotation stage) are mounted on translation belts that allow them to be moved independently of one another.
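The central-row readout described above can be illustrated with a minimal sketch. It is not part of the dataset's processing pipeline: the function name `central_row_sinogram` and the data layout (one frame per rotation angle, each frame a list of detector rows) are assumptions made for illustration. The scanner reads out only the central row at acquisition time, so the distributed data never contain full frames. The sketch uses plain Python lists to avoid dependencies.

```python
def central_row_sinogram(projections):
    """Collect the central detector row of each projection frame.

    projections: sequence of frames, one per rotation angle; each frame is a
                 sequence of detector rows (hypothetical layout, see lead-in).
    Returns a list with one detector row per angle, i.e. a 2D sinogram.
    """
    sinogram = []
    for frame in projections:
        n_rows = len(frame)
        if n_rows == 0:
            raise ValueError("each projection frame needs at least one row")
        # For an even row count there is no single middle row; this picks
        # the second of the two middle rows (index n_rows // 2).
        sinogram.append(list(frame[n_rows // 2]))
    return sinogram


# Tiny synthetic example: 3 angles, 4 detector rows, 5 detector columns.
frames = [[[angle * 100 + row * 10 + col for col in range(5)]
           for row in range(4)] for angle in range(3)]
sinogram = central_row_sinogram(frames)
print(len(sinogram), len(sinogram[0]))  # 3 5
```

Keeping one row per angle turns the cone-beam acquisition into an approximately fan-beam one, which is why the resulting data can be treated as a 2D problem.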
Please refer to the paper for all further technical details.
The complete dataset can be found via the following links: 1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000, OOD.
The reference reconstructions and segmentations can be found via the following links: 1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000, OOD.
The corresponding Python scripts for loading, pre-processing, reconstructing and segmenting the projection data as described in the paper can be found on GitHub. The GitHub repository also contains a machine-readable file with the scanning parameters and instrument data used for each acquisition mode, together with a script for loading it.
Note: It is advisable to use a graphical user interface when decompressing the .zip archives. If you encounter a zipbomb error when unzipping a file on a Linux system, rerun the command with the environment variable UNZIP_DISABLE_ZIPBOMB_DETECTION=TRUE set, for example by adding "export UNZIP_DISABLE_ZIPBOMB_DETECTION=TRUE" to your .bashrc.
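As an alternative to editing .bashrc, the variable can be set for a single `unzip` call only. The following is a minimal sketch, assuming the Info-ZIP `unzip` command is installed. The helper names `unzip_env`, `unzip_command` and `unzip_archive`, and the archive name in the usage note, are hypothetical.

```python
import os
import subprocess


def unzip_env(base_env=None):
    """Return a copy of the environment with zipbomb detection disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["UNZIP_DISABLE_ZIPBOMB_DETECTION"] = "TRUE"
    return env


def unzip_command(archive, target_dir):
    """Build the unzip call: -n never overwrites existing files,
    -d selects the directory to extract into."""
    return ["unzip", "-n", str(archive), "-d", str(target_dir)]


def unzip_archive(archive, target_dir):
    """Extract `archive` into `target_dir`; raises CalledProcessError
    if unzip exits with a non-zero status."""
    subprocess.run(unzip_command(archive, target_dir),
                   env=unzip_env(), check=True)
```

A call such as `unzip_archive("archive.zip", "extracted")` (placeholder names) then affects only that one process and leaves the shell configuration untouched.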
For more information or guidance in using the data collection, please get in touch with
Maximilian.Kiss [at] cwi.nl
Felix.Lucka [at] cwi.nl
Created:
2023-09-25



