HipAAsynth/Twin_hospital_sample100
Source: Hugging Face. Updated 2026-03-25; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/HipAAsynth/Twin_hospital_sample100
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- tabular-classification
- tabular-regression
tags:
- synthetic
- healthcare
- EHR
- deterministic
- hipaasynth
- twin-cohort
- bias-testing
- population-shift
version: "1.0.2"
size_categories:
- n<1K
configs:
- config_name: "hospital_a_patients"
  data_files:
  - split: train
    path: "hospital_a_patients.csv"
  default: true
- config_name: "hospital_b_patients"
  data_files:
  - split: train
    path: "hospital_b_patients.csv"
---
# Twin Hospital Dataset
Two matched deterministic synthetic hospital cohorts with controlled demographic variation.
## What This Is
A matched pair of synthetic hospital populations from HipAAsynth. Hospital A skews older (ages 55-95); Hospital B skews younger (ages 30-75). Both cohorts come from the same engine with different demographic settings, so condition prevalence differs naturally between them. Designed for model comparison and bias testing.
## Contents
| File | Description |
|------|-------------|
| 00_twin_summary.csv | Side-by-side cohort comparison metrics |
| hospital_a_patients.csv | 100 patients, age 55-95, seed 4200 |
| hospital_b_patients.csv | 100 patients, age 30-75, seed 4201 |
| schema.json | Column definitions |
| twin_validation.txt | Validation report |
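As a minimal sketch of reading the two patient files listed above, the snippet below uses only the Python standard library. The helper names `load_cohort` and `load_twins` are illustrative assumptions, not part of any HipAAsynth tooling, and the files are assumed to have been downloaded into a local directory.

```python
import csv
from pathlib import Path


def load_cohort(path):
    """Read one cohort CSV into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_twins(directory="."):
    """Load both cohorts from a local directory (hypothetical helper).

    The file names come from the Contents table; the directory is whatever
    location the files were downloaded to.
    """
    base = Path(directory)
    return {
        "hospital_a": load_cohort(base / "hospital_a_patients.csv"),
        "hospital_b": load_cohort(base / "hospital_b_patients.csv"),
    }
```

Calling `load_twins("path/to/download")` returns a dict with one list of row dicts per hospital. All values arrive as strings, so numeric columns such as `age` or `bmi` need explicit conversion.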
## Format
Standardized 13-column CSV with structured EHR-style fields:
| Column | Description |
|--------|-------------|
| patient_id | Unique patient identifier |
| age | Patient age |
| sex | Patient sex |
| ethnicity | Patient ethnicity |
| height_cm | Height in centimeters |
| weight_kg | Weight in kilograms |
| bmi | Body mass index |
| bmi_category | BMI classification |
| conditions | Pipe-delimited condition list |
| num_visits | Number of clinical visits |
| num_labs | Number of lab results |
| synthetic | Always True |
| disclaimer | Synthetic data disclaimer |
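Because `conditions` is a pipe-delimited list, it usually needs to be split before any per-condition analysis. The sketch below shows one way to do that and to compute per-condition prevalence. The condition labels in the sample rows are placeholders chosen for illustration; the exact label strings used in the CSV files are not stated here and should be checked against `schema.json`.

```python
from collections import Counter


def split_conditions(cell):
    """Split a pipe-delimited conditions cell into a list of labels."""
    return [label.strip() for label in cell.split("|") if label.strip()]


def condition_prevalence(rows):
    """Return the fraction of patients carrying each condition label."""
    total = len(rows)
    if total == 0:
        return {}
    counts = Counter()
    for row in rows:
        # A set guards against a label being counted twice for one patient.
        counts.update(set(split_conditions(row["conditions"])))
    return {label: counts[label] / total for label in sorted(counts)}


# Illustrative rows only; these are not taken from the dataset.
sample = [
    {"conditions": "Hypertension|Diabetes"},
    {"conditions": "Hypertension"},
    {"conditions": ""},
    {"conditions": "CKD"},
]
prevalence = condition_prevalence(sample)
```

An empty cell is treated as "no recorded conditions", which is an assumption about how the files encode patients without conditions.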
## Key Differences
| Metric | Hospital A | Hospital B |
|--------|-----------|-----------|
| Mean Age | 72.8 | 51.0 |
| Female % | 53.0% | 54.3% |
| Mean BMI | 28.7 | 29.2 |
| Diabetes | 22.9% | 15.1% |
| Hypertension | 72.9% | 52.8% |
| Heart Failure | 8.0% | 2.5% |
| COPD | 10.8% | 6.2% |
| CKD | 31.5% | 17.2% |
| Depression | 5.6% | 6.5% |
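To make the size of each gap explicit, the sketch below takes the condition prevalences from the table above and computes the difference in percentage points and the A-to-B ratio. The function name `prevalence_gaps` is an illustrative assumption; only the input numbers come from the table.

```python
# Condition prevalence in percent, copied from the Key Differences table:
# (Hospital A, Hospital B)
KEY_METRICS = {
    "Diabetes": (22.9, 15.1),
    "Hypertension": (72.9, 52.8),
    "Heart Failure": (8.0, 2.5),
    "COPD": (10.8, 6.2),
    "CKD": (31.5, 17.2),
    "Depression": (5.6, 6.5),
}


def prevalence_gaps(metrics):
    """Return the A-minus-B gap (percentage points) and A/B ratio per condition."""
    gaps = {}
    for name, (hospital_a, hospital_b) in metrics.items():
        gaps[name] = {
            "diff_pp": round(hospital_a - hospital_b, 1),
            "ratio": round(hospital_a / hospital_b, 2),
        }
    return gaps


gaps = prevalence_gaps(KEY_METRICS)
largest = max(gaps, key=lambda name: gaps[name]["diff_pp"])
```

On these figures, hypertension shows the largest absolute gap (20.1 percentage points), heart failure the largest relative gap (3.2 times), and depression is the only listed condition more prevalent in Hospital B.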
## Use Cases
- Model comparison across population shift
- Bias testing and fairness evaluation
- Deployment simulation
- A/B evaluation
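The first use case can be sketched as follows: score one fixed predictor on both cohorts and report the accuracy gap. Everything here is an assumption for illustration: the age-threshold rule stands in for a real model, the label string `Hypertension` is a placeholder, and the toy rows are invented rather than drawn from the dataset.

```python
def has_condition(row, label):
    """True if the pipe-delimited conditions cell contains the label."""
    return label in [c.strip() for c in row["conditions"].split("|")]


def age_rule(row, threshold=65):
    """Stand-in 'model': predict the condition for patients at or above a threshold age."""
    return float(row["age"]) >= threshold


def accuracy(rows, predict, label):
    """Fraction of rows where the prediction matches the recorded condition."""
    if not rows:
        return float("nan")
    hits = sum(predict(row) == has_condition(row, label) for row in rows)
    return hits / len(rows)


def compare_across_shift(cohort_a, cohort_b, predict, label):
    """Score the same predictor on both cohorts and report the gap."""
    acc_a = accuracy(cohort_a, predict, label)
    acc_b = accuracy(cohort_b, predict, label)
    return {"hospital_a": acc_a, "hospital_b": acc_b, "gap": acc_a - acc_b}


# Toy rows for illustration only; not taken from the dataset.
toy_a = [
    {"age": "80", "conditions": "Hypertension"},
    {"age": "70", "conditions": "Hypertension|CKD"},
    {"age": "60", "conditions": ""},
    {"age": "90", "conditions": "COPD"},
]
toy_b = [
    {"age": "40", "conditions": "Hypertension"},
    {"age": "50", "conditions": "Hypertension"},
    {"age": "70", "conditions": "Hypertension"},
    {"age": "35", "conditions": "Depression"},
]
report = compare_across_shift(toy_a, toy_b, age_rule, "Hypertension")
```

In practice the toy lists would be replaced with rows read from `hospital_a_patients.csv` and `hospital_b_patients.csv`, and `age_rule` with the model under test.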
## Reproducibility
Generation is deterministic: each cohort is tied to a fixed seed (4200 for Hospital A, 4201 for Hospital B), and repeated runs produce identical output.
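One way to check this claim on local copies is to compare file digests. The sketch below hashes files with SHA-256 from the standard library. No reference checksums are published in this card, so the comparison is between two copies or downloads you hold yourself; the helper names are illustrative assumptions.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(path_one, path_two):
    """True if two files are byte-for-byte identical according to SHA-256."""
    return sha256_of_file(path_one) == sha256_of_file(path_two)
```

Recording the digest of `hospital_a_patients.csv` once and comparing it after a later download is a simple way to detect any change between versions.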
## License
Data Packs: CC BY-NC 4.0
Proprietary implementation. Structured, inspectable, auditable outputs. Purchase of data packs or outputs does not transfer ownership of the engine or implementation details. Commercial license and usage terms apply.
## Legal Disclaimer
HipAAsynth outputs are synthetic and contain no real patient data or protected health information. Products and datasets are intended for testing, development, research, and benchmarking. They are not intended for clinical decision-making, diagnosis, treatment, or patient care.
## Links
- Website: [HipAAsynth.com](https://hipaasynth.com)
- Products: [hipaasynth.com/#products](https://hipaasynth.com/#products)
- Request a Cohort: [hipaasynth.com/request](https://hipaasynth.com/request.html)
- Contact: [HipAAsynth@gmail.com](mailto:HipAAsynth@gmail.com)
## Version History
| Version | Date | Changes |
|---------|------|---------|
| 1.0.2 | March 2026 | Added 00_twin_summary.csv. Fixed README. Updated to 100-patient samples. |
| 1.0.1 | March 2026 | Initial public release. |
Provider:
HipAAsynth



