Comprehensive Maternal Health Dataset for Supervised and Unsupervised Analysis
Source: DataONE; updated 2025-06-09; indexed 2025-11-01
Download link:
https://search.dataone.org/view/sha256:560ed1885d42030a81a6f94d1d76e35e2a7ee6a32ad16f7f8ad33f436bb63c2b
Resource description:
This dataset combines two related collections of maternal health records and supports both supervised and unsupervised machine learning tasks, from predictive modeling to pattern discovery. It covers demographics, pre-existing conditions, and clinical measurements. A subset of the records carries a 'risk_level' target variable, making it suitable for supervised models that predict maternal health risk. The remaining records are unlabeled and suit unsupervised techniques such as clustering and anomaly detection.

Columns:
age: Age of the mother in years.
bmi: Body Mass Index.
blood_pressure: Systolic and diastolic blood pressure.
gestational_age: Gestational age in weeks.
previous_c_section: History of previous Caesarean section (0 for no, 1 for yes).
previous_miscarriages: Number of previous miscarriages.
previous_preterm_birth: History of previous preterm birth (0 for no, 1 for yes).
chronic_hypertension: History of chronic hypertension (0 for no, 1 for yes).
diabetes: History of diabetes (0 for no, 1 for yes).
gestational_diabetes: History of gestational diabetes (0 for no, 1 for yes).
preeclampsia_history: History of preeclampsia (0 for no, 1 for yes).
multiple_pregnancy: Whether the patient is expecting multiple babies (0 for no, 1 for yes).
smoking: Smoking status (0 for no, 1 for yes).
alcohol_use: Alcohol consumption (0 for no, 1 for yes).
family_history: Family history of relevant conditions (0 for no, 1 for yes).
hb_level: Hemoglobin level.
urine_protein: Urine protein level.
blood_glucose: Blood glucose level.
risk_level: The target variable indicating the risk level (high, moderate, low). This column has missing values for the unsupervised portion of the dataset.
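Because the labeled and unlabeled records share one table, a first step in most workflows is to separate them by whether 'risk_level' is present. The sketch below shows one way to do that with the Python standard library. It rests on several assumptions not confirmed by the listing: that the data is distributed as CSV, that a missing 'risk_level' appears as an empty field or a token such as "NA", and that the column names match the list above. The sample rows are invented for illustration and are not taken from the dataset; `split_by_label` and `MISSING_TOKENS` are hypothetical names.

```python
import csv
import io

# Hypothetical sample rows, invented for illustration only (NOT real dataset values).
# Only a few of the documented columns are included to keep the example short.
SAMPLE_CSV = """age,bmi,gestational_age,smoking,risk_level
29,24.1,32,0,low
41,33.5,28,1,high
35,27.0,36,0,
"""

# Assumption: how a missing label is encoded is not stated in the listing,
# so a few common spellings are treated as "missing".
MISSING_TOKENS = {"", "na", "nan", "null", "none"}


def split_by_label(rows, target="risk_level"):
    """Return (labeled, unlabeled) lists of rows, split on whether `target` is present."""
    labeled, unlabeled = [], []
    for row in rows:
        # DictReader yields None for fields absent from a short row.
        value = (row.get(target) or "").strip().lower()
        if value in MISSING_TOKENS:
            unlabeled.append(row)
        else:
            labeled.append(row)
    return labeled, unlabeled


# For a real file, replace io.StringIO(SAMPLE_CSV) with open(path, newline="").
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
labeled, unlabeled = split_by_label(rows)
print(len(labeled), "labeled;", len(unlabeled), "unlabeled")
```

The labeled rows can then feed a classifier for 'risk_level', while the unlabeled rows (with the target column dropped) can go to clustering or anomaly detection. All values arrive as strings from `csv.DictReader`, so numeric columns need explicit conversion before modeling.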
Created:
2025-10-29



