pGAN Synthetic Dataset: A Deep Learning Approach to Private Data Sharing of Medical Images Using Conditional GANs
收藏NIAID Data Ecosystem2026-03-12 收录
下载链接:
https://zenodo.org/record/5031880
下载链接
链接失效反馈官方服务:
资源简介:
Synthetic dataset for A Deep Learning Approach to Private Data Sharing of Medical Images Using Conditional GANs
Dataset specification:
MRI images of Vertebral Units labelled based on region
Dataset is comprised of 10000 pairs of images and labels
Image and label pair number k can be selected by: synthetic_dataset['images'][k] and synthetic_dataset['regions'][k]
Images are 3D of size (9, 64, 64)
Regions are stored as an integer. Mapping is 0: cervical, 1: thoracic, 2: lumbar
Arxiv paper: https://arxiv.org/abs/2106.13199
Github code: https://github.com/tcoroller/pGAN/
Abstract:
Sharing data from clinical studies can facilitate innovative data-driven research and ultimately lead to better public health. However, sharing biomedical data can put sensitive personal information at risk. This is usually solved by anonymization, which is a slow and expensive process. An alternative to anonymization is sharing a synthetic dataset that bears a behaviour similar to the real data but preserves privacy. As part of the collaboration between Novartis and the Oxford Big Data Institute, we generate a synthetic dataset based on COSENTYX Ankylosing Spondylitis (AS) clinical study. We apply an Auxiliary Classifier GAN (ac-GAN) to generate synthetic magnetic resonance images (MRIs) of vertebral units (VUs). The images are conditioned on the VU location (cervical, thoracic and lumbar). In this paper, we present a method for generating a synthetic dataset and conduct an in-depth analysis on its properties of along three key metrics: image fidelity, sample diversity and dataset privacy.
创建时间:
2021-06-26



