PHT_Simulated_health_data.csv
Source: DataCite Commons. Updated: 2020-08-28. Indexed: 2024-07-27.
Download link:
https://figshare.com/articles/PHT_Simulated_health_data_csv/7379810/2
Description:
This simulated dataset is derived from a publicly available simulated dataset whose variables can be interpreted as sex, BMI, number of children, smoking status, region, and reimbursement information of patients [https://edu.kpfu.ru/pluginfile.php/278552/mod_resource/content/1/MachineLearningR__Brett_Lantz.pdf]. In addition, we generated artificial personal identifiers (date of birth, zipcode, house number, and sex) for record linkage, using a Python tool called Faker [https://faker.readthedocs.io/en/master/]. First, the dataset is vertically split over two providers. Provider A holds the artificial personal identifiers (date of birth, zipcode, house number, and sex) together with BMI, number of children, and smoking status. Provider B holds the same artificial personal identifiers together with living region and reimbursement information. Second, we simulated a setting in which Provider A hosts only a small number of patients (1338), while Provider B hosts about 64400 patients, including all of Provider A's patients.
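The vertical split and the role of the shared identifiers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the column names (`date_of_birth`, `zipcode`, `house_number`, `sex`, `bmi`, `children`, `smoker`, `region`, `reimbursement`), the helper functions, and the toy rows are invented and are not taken from the actual CSV file. It uses exact matching on the four identifiers, which is only one possible linkage strategy.

```python
# Toy illustration only: column names and rows are invented for this sketch,
# not copied from PHT_Simulated_health_data.csv.
LINK_KEYS = ("date_of_birth", "zipcode", "house_number", "sex")

# Provider A: identifiers plus BMI, number of children, smoking status.
provider_a = [
    {"date_of_birth": "1980-01-15", "zipcode": "1000", "house_number": 12,
     "sex": "female", "bmi": 25.0, "children": 0, "smoker": "yes"},
    {"date_of_birth": "1975-06-30", "zipcode": "2000", "house_number": 3,
     "sex": "male", "bmi": 30.0, "children": 1, "smoker": "no"},
]

# Provider B: the same identifiers plus region and reimbursement.
# It also hosts patients that Provider A does not know about.
provider_b = [
    {"date_of_birth": "1980-01-15", "zipcode": "1000", "house_number": 12,
     "sex": "female", "region": "southwest", "reimbursement": 1200.0},
    {"date_of_birth": "1975-06-30", "zipcode": "2000", "house_number": 3,
     "sex": "male", "region": "southeast", "reimbursement": 800.0},
    {"date_of_birth": "1990-11-02", "zipcode": "3000", "house_number": 7,
     "sex": "male", "region": "northwest", "reimbursement": 450.0},
]


def link_key(record):
    """Build the exact-match linkage key from the shared identifiers."""
    return tuple(record[k] for k in LINK_KEYS)


def link_records(rows_a, rows_b):
    """Join Provider A rows to Provider B rows that share the same key.

    Assumes each key is unique within Provider B; with duplicate keys,
    the last row seen would silently win.
    """
    index_b = {link_key(r): r for r in rows_b}
    linked = []
    for row in rows_a:
        match = index_b.get(link_key(row))
        if match is not None:
            linked.append({**row, **match})
    return linked


linked = link_records(provider_a, provider_b)
for row in linked:
    print(row["date_of_birth"], row["bmi"], row["region"], row["reimbursement"])
```

In this sketch the third Provider B row has no counterpart at Provider A and is dropped from the linked result, mirroring the description in which Provider B's patients are a superset of Provider A's.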
Provided by:
figshare
Created:
2018-11-25



