2DeteCT - A large 2D expandable, trainable, experimental Computed Tomography dataset for machine learning: Slices 2,001-3,000
Source: Mendeley Data. Updated: 2024-05-10. Indexed: 2024-06-27.
Download link:
https://zenodo.org/records/8014787
Description:
This upload contains slices 2,001-3,000 from the data collection described in Maximilian B. Kiss, Sophia B. Coban, K. Joost Batenburg, Tristan van Leeuwen, and Felix Lucka, "2DeteCT - A large 2D expandable, trainable, experimental Computed Tomography dataset for machine learning", Sci Data 10, 576 (2023) or arXiv:2306.05907 (2023).

Abstract: "Recent research in computational imaging largely focuses on developing machine learning (ML) techniques for image reconstruction, which requires large-scale training datasets consisting of measurement data and ground-truth images. However, suitable experimental datasets for X-ray Computed Tomography (CT) are scarce, and methods are often developed and evaluated only on simulated data. We fill this gap by providing the community with a versatile, open 2D fan-beam CT dataset suitable for developing ML techniques for a range of image reconstruction tasks. To acquire it, we designed a sophisticated, semi-automatic scan procedure that utilizes a highly-flexible laboratory X-ray CT setup. A diverse mix of samples with high natural variability in shape and density was scanned slice-by-slice (5000 slices in total) with high angular and spatial resolution and three different beam characteristics: A high-fidelity, a low-dose and a beam-hardening-inflicted mode. In addition, 750 out-of-distribution slices were scanned with sample and beam variations to accommodate robustness and segmentation tasks. We provide raw projection data, reference reconstructions and segmentations based on an open-source data processing pipeline."

The data collection was acquired using a highly flexible, programmable, custom-built X-ray CT scanner, the FleX-ray scanner, developed by TESCAN-XRE NV and located in the FleX-ray Lab at the Centrum Wiskunde & Informatica (CWI) in Amsterdam, Netherlands. The scanner consists of a cone-beam microfocus X-ray point source (limited to 90 kV and 90 W) that projects polychromatic X-rays onto a 14-bit CMOS (complementary metal-oxide semiconductor) flat panel detector with a CsI(Tl) scintillator (Dexella 1512NDT) and 1536-by-1944 pixels, \(74.8\,\mu\mathrm{m}^2\) each. To create a 2D dataset, a fan-beam geometry was mimicked by reading out only the central row of the detector. Between the source and the detector there is a rotation stage on which samples can be mounted. The machine components (i.e., the source, the detector panel, and the rotation stage) are mounted on translation belts that allow the components to be moved independently of one another. Please refer to the paper for all further technical details.

The complete dataset can be found via the following links: 1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000, OOD.

The reference reconstructions and segmentations can be found via the following links: 1-1000, 1001-2000, 2001-3000, 3001-4000, 4001-5000, OOD.

The corresponding Python scripts for loading, pre-processing, reconstructing, and segmenting the projection data in the way described in the paper can be found on GitHub. A machine-readable file with the scanning parameters and instrument data for each acquisition mode, as well as a script for loading it, is also available in the GitHub repository.

Note: It is advisable to use the graphical user interface when decompressing the .zip archives. If you encounter a zipbomb error when unzipping a file on a Linux system, rerun the command with the environment variable UNZIP_DISABLE_ZIPBOMB_DETECTION=TRUE set, for example by adding "export UNZIP_DISABLE_ZIPBOMB_DETECTION=TRUE" to your .bashrc.

For more information or guidance in using the data collection, please get in touch with Maximilian.Kiss [at] cwi.nl or Felix.Lucka [at] cwi.nl.
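The zipbomb workaround in the note can also be applied to a single extraction without editing .bashrc. The following is a minimal sketch, not part of the authors' scripts: it assumes the Info-ZIP `unzip` command is installed, and the helper names and the archive file name in the usage comment are hypothetical.

```python
import os
import subprocess


def unzip_env() -> dict:
    """Return a copy of the current environment with zip-bomb detection disabled."""
    env = dict(os.environ)  # copy, so the parent process environment is untouched
    env["UNZIP_DISABLE_ZIPBOMB_DETECTION"] = "TRUE"
    return env


def unzip_command(archive: str, dest: str) -> list:
    """Build the unzip invocation that extracts `archive` into directory `dest`."""
    return ["unzip", archive, "-d", dest]


def extract(archive: str, dest: str) -> None:
    """Run unzip with the environment variable set for this one call only."""
    subprocess.run(unzip_command(archive, dest), env=unzip_env(), check=True)


# Example call (hypothetical file name, replace with the downloaded archive):
# extract("slices_2001_3000.zip", "data/")
```

Setting the variable per call keeps the relaxed check limited to archives you have chosen to trust, rather than disabling it for every later `unzip` run in the shell.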
Created:
2023-06-28
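Collected summary
Dataset introduction

Background and challenges
Background overview
This dataset is part of the 2DeteCT dataset and contains 1,000 experimental 2D computed tomography slices (numbered 2001-3000), designed for image reconstruction tasks in machine learning. The data were acquired with a highly flexible X-ray CT scanner in three different beam-characteristic modes. Raw projection data, reference reconstructions, and segmentations are provided to support the development and evaluation of machine learning algorithms.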
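Because the 5,000 in-distribution slices are split across five uploads, it can help to look up which upload holds a given slice. The sketch below uses only the ranges listed in the description; the function name is hypothetical, and the numbering of the OOD slices is not stated here, so any index outside 1-5000 is rejected.

```python
# Slice ranges of the five in-distribution uploads, as listed in the description.
PARTS = [(1, 1000), (1001, 2000), (2001, 3000), (3001, 4000), (4001, 5000)]


def part_for_slice(slice_id: int) -> tuple:
    """Return the (first, last) slice range of the upload containing `slice_id`."""
    for first, last in PARTS:
        if first <= slice_id <= last:
            return (first, last)
    # OOD slice numbering is not given in the description, so it is not handled.
    raise ValueError(f"slice {slice_id} is outside the in-distribution range 1-5000")


# Slice 2500 belongs to this upload:
# part_for_slice(2500) returns (2001, 3000)
```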



