CPRD COVID-19 Symptoms and Risk Factors Synthetic Dataset

SAIL Databank, indexed 2026-03-28
Download link:
https://wri-data-catalogue-worldresources.hub.arcgis.com/maps/685
Description:
This wholly synthetic dataset is based on real anonymised primary care patient data extracted from the CPRD Aurum database. To preserve patient privacy, researchers will not be able to access the real anonymised patient data extract that was used as the basis for generating the synthetic dataset. The dataset focuses on patients presenting to primary care with symptoms indicative of COVID-19 (confirmed or suspected COVID-19) and control patients with negative COVID-19 test results. It includes data on sociodemographic and clinical risk factors. The ‘ground truth’ CPRD Aurum data extract used as the basis for generating this synthetic dataset included data up to 13/04/2021 on patients with either suspected or confirmed COVID-19, as ascertained from the primary care record. The ground truth data extract was subject to data pre-processing; as a result, the synthetic dataset based on it does not reflect the structure of the source CPRD Aurum database. The development of this synthetic dataset was funded by NHS X, using the synthetic data generation and evaluation framework developed by CPRD under a grant from the Regulators’ Pioneer Fund, launched by the Department for Business, Energy and Industrial Strategy (BEIS) and managed by Innovate UK. The methodology used to generate and evaluate this synthetic dataset is outlined in Wang et al. 2019 (DOI: 10.1109/CBMS.2019.00036).