HipAAsynth/Twin_hospital_sample100

Hugging Face | Updated 2026-03-25 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/HipAAsynth/Twin_hospital_sample100
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- tabular-classification
- tabular-regression
tags:
- synthetic
- healthcare
- EHR
- deterministic
- hipaasynth
- twin-cohort
- bias-testing
- population-shift
version: "1.0.2"
size_categories:
- n<1K
configs:
- config_name: "hospital_a_patients"
  data_files:
  - split: train
    path: "hospital_a_patients.csv"
  default: true
- config_name: "hospital_b_patients"
  data_files:
  - split: train
    path: "hospital_b_patients.csv"
---

# Twin Hospital Dataset

Two matched deterministic synthetic hospital cohorts with controlled demographic variation.

## What This Is

A matched pair of synthetic hospital populations from HipAAsynth. Hospital A skews older (age 55-95). Hospital B skews younger (age 30-75). Same engine, different demographics, natural condition prevalence differences. Designed for model comparison and bias testing.

## Contents

| File | Description |
|------|-------------|
| 00_twin_summary.csv | Side-by-side cohort comparison metrics |
| hospital_a_patients.csv | 100 patients, age 55-95, seed 4200 |
| hospital_b_patients.csv | 100 patients, age 30-75, seed 4201 |
| schema.json | Column definitions |
| twin_validation.txt | Validation report |

## Format

Standardized 13-column CSV with structured EHR-style fields:

| Column | Description |
|--------|-------------|
| patient_id | Unique patient identifier |
| age | Patient age |
| sex | Patient sex |
| ethnicity | Patient ethnicity |
| height_cm | Height in centimeters |
| weight_kg | Weight in kilograms |
| bmi | Body mass index |
| bmi_category | BMI classification |
| conditions | Pipe-delimited condition list |
| num_visits | Number of clinical visits |
| num_labs | Number of lab results |
| synthetic | Always True |
| disclaimer | Synthetic data disclaimer |

## Key Differences

| Metric | Hospital A | Hospital B |
|--------|-----------|-----------|
| Mean Age | 72.8 | 51.0 |
| Female % | 53.0% | 54.3% |
| Mean BMI | 28.7 | 29.2 |
| Diabetes | 22.9% | 15.1% |
| Hypertension | 72.9% | 52.8% |
| Heart Failure | 8.0% | 2.5% |
| COPD | 10.8% | 6.2% |
| CKD | 31.5% | 17.2% |
| Depression | 5.6% | 6.5% |

## Use Cases

- Model comparison across population shift
- Bias testing and fairness evaluation
- Deployment simulation
- A/B evaluation

## Reproducibility

Deterministic generation produces identical output across runs.

## License

Data Packs: CC BY-NC 4.0

Proprietary implementation. Structured, inspectable, auditable outputs. Purchase of data packs or outputs does not transfer ownership of the engine or implementation details. Commercial license and usage terms apply.

## Legal Disclaimer

HipAAsynth outputs are synthetic and contain no real patient data or protected health information. Products and datasets are intended for testing, development, research, and benchmarking. They are not intended for clinical decision-making, diagnosis, treatment, or patient care.

## Links

- Website: [HipAAsynth.com](https://hipaasynth.com)
- Products: [hipaasynth.com/#products](https://hipaasynth.com/#products)
- Request a Cohort: [hipaasynth.com/request](https://hipaasynth.com/request.html)
- Contact: [HipAAsynth@gmail.com](mailto:HipAAsynth@gmail.com)

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0.2 | March 2026 | Added 00_twin_summary.csv. Fixed README. Updated to 100-patient samples. |
| 1.0.1 | March 2026 | Initial public release. |
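
## Example: Recomputing Condition Prevalence

The Format section describes `conditions` as a pipe-delimited list, and the Key Differences table reports per-condition prevalence for each hospital. A minimal sketch of how such figures could be recomputed from the two patient files, using only the Python standard library, might look like the following. The helper names (`load_patients`, `condition_prevalence`, `prevalence_gap`) are hypothetical and not part of any HipAAsynth tooling, and the exact condition labels stored in the CSV are an assumption; `schema.json` should be checked for the actual values.

```python
import csv
from collections import Counter


def load_patients(path):
    """Read one cohort CSV into a list of dict rows (one dict per patient)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def condition_prevalence(rows):
    """Return {condition: percent of patients whose list includes it}."""
    counts = Counter()
    for row in rows:
        raw = row.get("conditions") or ""
        # Split the pipe-delimited list, drop empty entries, and use a set
        # so a condition repeated within one patient is counted once.
        names = {c.strip() for c in raw.split("|") if c.strip()}
        counts.update(names)
    n = len(rows)
    if n == 0:
        return {}
    return {name: 100.0 * k / n for name, k in counts.items()}


def prevalence_gap(rows_a, rows_b):
    """Percentage-point difference (A minus B) for every condition seen."""
    a = condition_prevalence(rows_a)
    b = condition_prevalence(rows_b)
    names = sorted(set(a) | set(b))
    return {name: a.get(name, 0.0) - b.get(name, 0.0) for name in names}


if __name__ == "__main__":
    # File names are taken from the Contents table; they are assumed to be
    # in the current working directory.
    hospital_a = load_patients("hospital_a_patients.csv")
    hospital_b = load_patients("hospital_b_patients.csv")
    for name, gap in prevalence_gap(hospital_a, hospital_b).items():
        print(f"{name}: {gap:+.1f} percentage points (A minus B)")
```

Comparing the printed gaps against `00_twin_summary.csv` is one way to sanity-check a local copy of the files before using them for population-shift or bias experiments.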
Provider:
HipAAsynth