HipAAsynth/synthetic-india-tamil-nadu
Source: Hugging Face. Updated 2026-03-25; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/HipAAsynth/synthetic-india-tamil-nadu
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- tabular-classification
- tabular-regression
tags:
- synthetic
- healthcare
- EHR
- deterministic
- hipaasynth
- india
- tamil-nadu
- ICMR
- NFHS-5
- diabetes
version: "1.0.0"
size_categories:
- n<1K
configs:
- config_name: default
  data_files:
  - split: train
    path: "synthetic_india_tamil_nadu_sample_100_seed92.csv"
---
# Synthetic India — Tamil Nadu (Urban Chennai)
100 deterministic synthetic patients calibrated to Tamil Nadu urban epidemiological data.
## What This Is
A regional synthetic cohort from HipAAsynth, calibrated to published Indian health statistics for urban Tamil Nadu. All rates are sourced from ICMR-INDIAB, NFHS-5, WHO India, Census of India, IDF Diabetes Atlas, and ABDM. No estimates. No guesses.
## Cohort Profile
| Property | Value |
|----------|-------|
| State | Tamil Nadu |
| Region | Urban (Chennai) |
| Seed | 92 |
| Patients | 100 |
| Diabetes Prevalence | 26.4% (ICMR-INDIAB 2023) |
| Hypertension | 39.8% (NFHS-5) |
| Obesity | 40.3% (NFHS-5) |
| Female Anemia | 44.0% urban (NFHS-5) |
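
With only 100 patients, an observed rate can only be a whole percentage, so the sample cannot hit a target such as 26.4% exactly. The sketch below turns the population-wide rates from the table into expected counts and a rough spread. The spread assumes each patient is an independent draw at the target rate, which is an assumption made for illustration and not a documented property of the HipAAsynth generator. Female anemia is left out because its denominator is the number of women in the sample, which this table does not state.

```python
import math

# Target rates copied from the Cohort Profile table (population-wide rows only).
TARGET_RATES = {
    "diabetes": 0.264,
    "hypertension": 0.398,
    "obesity": 0.403,
}
N_PATIENTS = 100


def expected_count(rate: float, n: int = N_PATIENTS) -> float:
    """Expected number of positive patients for a given target rate."""
    return rate * n


def binomial_sd(rate: float, n: int = N_PATIENTS) -> float:
    """Standard deviation of the positive count.

    ASSUMPTION: patients are independent Bernoulli draws at `rate`.
    The actual generator may not sample this way.
    """
    return math.sqrt(n * rate * (1.0 - rate))


if __name__ == "__main__":
    for name, rate in TARGET_RATES.items():
        print(
            f"{name}: expect about {expected_count(rate):.1f} "
            f"(sd {binomial_sd(rate):.1f}) of {N_PATIENTS} patients"
        )
```

Under that independence assumption, a sample count a few patients away from the expected value is unremarkable at this sample size.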
## Contents
| File | Description |
|------|-------------|
| synthetic_india_tamil_nadu_sample_100_seed92.csv | 100 synthetic patients |
| synthetic_india_tamil_nadu_sample_100_seed92.json | Same data in JSON with metadata |
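
A minimal sketch of reading the CSV with the Python standard library, assuming the file has been downloaded to the working directory and is UTF-8 encoded with a header row. The helper names `read_patients` and `load_patients` are hypothetical, not part of any HipAAsynth tooling. The JSON file is not covered here because the card does not describe the layout of its metadata.

```python
import csv
import io
from typing import Dict, List, TextIO

# File name as listed in the Contents table.
CSV_NAME = "synthetic_india_tamil_nadu_sample_100_seed92.csv"


def read_patients(handle: TextIO) -> List[Dict[str, str]]:
    """Read every row as a dict keyed by the CSV header. Values stay strings."""
    return list(csv.DictReader(handle))


def load_patients(path: str = CSV_NAME) -> List[Dict[str, str]]:
    """Open the CSV at `path` (assumed UTF-8) and return all rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        return read_patients(handle)
```

Calling `len(load_patients())` should report 100 if the downloaded file matches this card. All values are returned as strings, so numeric columns such as `age`, `bmi`, and `hba1c` need converting before analysis.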
## Columns
| Column | Description |
|--------|-------------|
| patient_id | Unique identifier (IN-TN-92-XXXX) |
| data_type | Always SYNTHETIC |
| country | India |
| state | Tamil Nadu |
| region_type | urban |
| age | Patient age (18+) |
| sex | F or M |
| language_region | Tamil |
| primary_diagnosis | Primary condition |
| secondary_diagnosis | Secondary condition |
| payer | PMJAY, Private Insurance, or Out of Pocket |
| diabetes_type | Type 1 or Type 2 (if diabetic) |
| hba1c | HbA1c value (if diabetic) |
| bmi | Body mass index |
| anemia | Boolean |
| tb_exposure | Boolean |
| vaccination_covid19 | Boolean |
| vaccination_polio | Boolean |
| vaccination_bcg | Boolean |
| healthcare_access_primary | Boolean |
| healthcare_access_secondary | Boolean |
| healthcare_access_tertiary | Boolean |
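
The column descriptions above imply a few constraints that can be checked row by row. The sketch below is one possible validator, not an official one: `check_row` is a hypothetical helper, it assumes the `XXXX` in the identifier stands for four digits, and it assumes rows arrive as string-valued dicts (for example from `csv.DictReader`). Boolean columns are not checked because the card does not say how they are encoded.

```python
import re
from typing import Dict, List, Optional

# ASSUMPTION: "XXXX" in the card means exactly four digits.
PATIENT_ID_RE = re.compile(r"^IN-TN-92-\d{4}$")
ALLOWED_PAYERS = {"PMJAY", "Private Insurance", "Out of Pocket"}
ALLOWED_SEX = {"F", "M"}


def check_row(row: Dict[str, Optional[str]]) -> List[str]:
    """Return a list of human-readable problems found in one row."""
    problems = []
    if not PATIENT_ID_RE.match(row.get("patient_id") or ""):
        problems.append("patient_id does not match IN-TN-92-XXXX")
    if row.get("data_type") != "SYNTHETIC":
        problems.append("data_type is not SYNTHETIC")
    if row.get("sex") not in ALLOWED_SEX:
        problems.append("sex is not F or M")
    if row.get("payer") not in ALLOWED_PAYERS:
        problems.append("unexpected payer")
    try:
        if float(row.get("age") or "") < 18:
            problems.append("age is below 18")
    except ValueError:
        problems.append("age is not numeric")
    return problems
```

An empty list means the row passed every check. Running the function over all rows and collecting the non-empty results gives a quick sanity report before the data is used in tests or benchmarks.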
## Data Sources
| Statistic | Source |
|-----------|--------|
| Diabetes prevalence | ICMR-INDIAB 2023 Tamil Nadu chapter |
| Hypertension, obesity, anemia | NFHS-5 2019-2021 Tamil Nadu factsheet |
| TB incidence | WHO India Country Profile 2023 |
| Age distribution | Census of India 2021 projection |
| HbA1c uncontrolled rate | IDF Diabetes Atlas 10th Ed |
| Payer mix | Ayushman Bharat ABDM 2023 |
## Bilingual Data Notice
English: SYNTHETIC DATA ONLY. Not derived from real patients. No PHI. No IRB. No real patients. Ever.
Hindi: केवल सिंथेटिक डेटा। वास्तविक रोगियों से प्राप्त नहीं। कोई PHI नहीं। कोई IRB नहीं। कोई वास्तविक मरीज़ नहीं। कभी नहीं।
## Reproducibility
Deterministic generation produces identical output across runs. Same seed = same dataset, always.
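
One way to check this claim for two copies of the file is to compare their SHA-256 digests. This is a sketch of a generic byte-level comparison, not a documented HipAAsynth procedure. The function names are hypothetical, and the check assumes that "identical output" means byte-identical files, so differences in line endings or encoding would count as a mismatch.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_dataset(path_a: str, path_b: str) -> bool:
    """True when both files have the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Recording the digest of a downloaded copy also makes it easy to confirm later that a test fixture has not been modified.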
## License
Data Packs: CC BY-NC 4.0
The generation engine is proprietary; its outputs are structured, inspectable, and auditable. Purchase of data packs or outputs does not transfer ownership of the engine or its implementation details. Commercial license and usage terms apply.
## Legal Disclaimer
HipAAsynth outputs are synthetic and contain no real patient data or protected health information. Products and datasets are intended for testing, development, research, and benchmarking. They are not intended for clinical decision-making, diagnosis, treatment, or patient care.
## Links
- Website: [HipAAsynth.com](https://hipaasynth.com)
- Products: [hipaasynth.com/#products](https://hipaasynth.com/#products)
- Request a Cohort: [hipaasynth.com/request](https://hipaasynth.com/request.html)
- Contact: [HipAAsynth@gmail.com](mailto:HipAAsynth@gmail.com)
## Version History
| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | March 2026 | Initial release. 100 patients, seed 92. |
Provider: HipAAsynth



