five

MPI-MNIST Dataset

收藏
NIAID Data Ecosystem2026-05-02 收录
下载链接:
https://zenodo.org/record/12799416
下载链接
链接失效反馈
官方服务:
资源简介:
A dataset for magnetic particle imaging based on the MNIST dataset. This dataset contains simulated MPI measurements along with ground truth phantoms selected from the MNIST database of handwritten digits. A state-of-the-art model-based system matrix is used to simulate the MPI measurements of the MNIST phantoms. These measurements are equipped with noise perturbations captured by the preclinical MPI system (Bruker, Ettlingen, Germany). The dataset can be utilized in its provided form, while additional data is included to offer flexibility for creating customized versions. MPI-MNIST features four different system matrices, each available in three spatial resolutions. The provided data is generated using a specified system matrix at highest spatial resolution. Reconstruction operations can be performed by using any of the provided system matrices at a lower resolution. This setup allows for simulating reconstructions from either an exact or an inexact forward operator. To cover further operator deviation setups, we provide additional noise data for the application of pixelwise noise to the reconstruction system matrix. For supporting the development of learning-based methods, a large amount of further noise samples, captured by the Bruker scanner, is provided. For a detailed description of the dataset, see arxiv.org/abs/2501.05583. The Python-based GitHub repository available at https://github.com/meiraiske/MPI-MNIST can be used for downloading the data from this website and preparing it for project use which includes an integration to PyTorch or PyTorch Lightning modules. File Structure  All data, except for the phantoms, is provided in the MDF file format. This format is specifically tailored to store MPI data and contains metadata corresponding to the experimental setup. The ground truth phantoms are provided as HDF5 files since they do not require any metadata.  SM: Contains twelve system matrices named SM_{physical model}_{resolution}.mdf. It covers four physical models given in three resolutions ('coarse', 'int' and 'fine'). The highest resolution ('fine') is used for data generation.  large_noise: Contains large_NoiseMeas.mdf with 390060 noise measurements. Each noise measurement has been averaged over ten empty scanner measurements. This can be used e.g. for learning-based methods.  For dataset in ['train', 'test']: {dataset}_noise: Contains four noise matrices, where each noise measurement has been averaged over ten empty scanner measurements:     1. NoiseMeas_phantom_{dataset}.mdf : Additive measurement noise for simulated measurements.    2. NoiseMeas_phantom_bg_{dataset}.mdf : Unused noise reserved for background correction of 1.     3. NoiseMeas_SM_{dataset}.mdf : System Matrix noise, that can be applied to each pixel of the reconstruction system matrix.    4. NoiseMeas_SM_bg_{dataset}.mdf : Unused noise reserved for background correction of 3.   {dataset}_gt: Contains {dataset}_gt.hdf5 with flattened and preprocessed ground truth MNIST phantoms given in coarse resolution (15x17=255 pixels) with pixel values in [0, 10].  {dataset}_obs: Contains {dataset}_obs.mdf with noise free simulated measurements (observations) of {dataset}_gt.hdf5 using the system matrix stored in SM_fluid_opt_fine.mdf.  {dataset}_obsnoisy: Contains {dataset}_obsnoisy.mdf with noise contained simulated measurements, resulting from {dataset}_obs.mdf and {dataset}_phantom_noise.mdf.  In line with MNIST, each MDF/HDF5 file in {dataset}_gt, {dataset}_obs, {dataset}_obsnoisy for dataset in ['train', 'test'] contains 60000 samples for 'train' and 10000 samples for 'test'. The data can be manually reproduced in the intermediate resolution (45x51=2295 pixels) from the files in this dataset using the system matrices in intermediate ('int') resolution for reconstruction and upsampling the ground truth phantoms by 3 pixels per dimension. This case is also implemented in the Github repository .  The PDF file MPI-MNIST_Metadata.pdf contains a list of meta information for each of the MDF files of this dataset.
创建时间:
2025-01-14
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作