PHT_Simulated_health_data.csv

Source: Figshare. Updated: 2018-11-25. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/PHT_Simulated_health_data_csv/7379810/1
Description:
This dataset is derived from a publicly available simulated dataset whose variables can be interpreted as sex, BMI, number of children, smoking status, region, and reimbursement information of patients [https://edu.kpfu.ru/pluginfile.php/278552/mod_resource/content/1/MachineLearningR__Brett_Lantz.pdf]. In addition, we generated artificial personal identifiers (date of birth, zipcode, house number, and sex) for record linkage using Faker, a Python tool [https://faker.readthedocs.io/en/master/]. First, the dataset is vertically split across two providers: Provider A holds the artificial personal identifiers (date of birth, zipcode, house number, and sex) together with BMI, number of children, and smoking status; Provider B holds the same artificial personal identifiers together with living region and reimbursement information. Second, we simulated that Provider A hosts only a small number of patients (1338), while Provider B hosts about 400,000 patients, including Provider A’s patients.
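The vertical split and the identifier-based linkage described above can be illustrated with a minimal sketch. This is not the authors' generation script: it uses only the Python standard library instead of Faker, and all field names (`date_of_birth`, `zipcode`, `house_number`, `bmi`, `reimbursement`, and so on), value ranges, and the population sizes (200 and 20 rather than about 400,000 and 1338) are assumptions chosen for illustration.

```python
import random
from datetime import date, timedelta

# Hypothetical column names for the four artificial identifiers.
ID_FIELDS = ("date_of_birth", "zipcode", "house_number", "sex")


def make_population(n, seed=0):
    """Generate n made-up people, each with a unique identifier tuple."""
    rng = random.Random(seed)
    people, seen = [], set()
    while len(people) < n:
        ident = (
            (date(1940, 1, 1) + timedelta(days=rng.randrange(25000))).isoformat(),
            str(rng.randrange(1000, 10000)),
            rng.randrange(1, 300),
            rng.choice(["female", "male"]),
        )
        if ident in seen:  # skip collisions so exact-match linkage is unambiguous
            continue
        seen.add(ident)
        person = dict(zip(ID_FIELDS, ident))
        person.update(
            bmi=round(rng.uniform(16, 45), 1),
            children=rng.randrange(0, 6),
            smoker=rng.choice(["yes", "no"]),
            region=rng.choice(["northeast", "northwest", "southeast", "southwest"]),
            reimbursement=round(rng.uniform(1000, 60000), 2),
        )
        people.append(person)
    return people


def project(person, columns):
    """Keep the identifiers plus the given attribute columns."""
    return {c: person[c] for c in ID_FIELDS + columns}


def vertical_split(people, n_a, seed=0):
    """Provider A: a subset of patients with health attributes.
    Provider B: all patients with region and reimbursement."""
    rng = random.Random(seed)
    a_people = rng.sample(people, n_a)
    provider_a = [project(p, ("bmi", "children", "smoker")) for p in a_people]
    provider_b = [project(p, ("region", "reimbursement")) for p in people]
    return provider_a, provider_b


def link(provider_a, provider_b):
    """Exact-match record linkage on the shared identifier fields."""
    index_b = {tuple(r[f] for f in ID_FIELDS): r for r in provider_b}
    linked = []
    for rec in provider_a:
        match = index_b.get(tuple(rec[f] for f in ID_FIELDS))
        if match is not None:
            linked.append({**rec, **match})
    return linked


people = make_population(200)
provider_a, provider_b = vertical_split(people, n_a=20)
linked = link(provider_a, provider_b)
print(len(provider_a), len(provider_b), len(linked))
```

Because Provider A's patients are drawn from Provider B's population and the identifier tuples are unique in this sketch, every Provider A record finds exactly one match. Real linkage would typically have to cope with duplicate or noisy identifiers, which exact matching does not handle.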
Created:
2018-11-25