
AI-Ready Dataset for Socioeconomic Clustering and Preterm Birth Risk Analysis in Brazil

Source: DataCite Commons | Updated: 2026-04-20 | Indexed: 2026-05-04
Download link:
https://data.mendeley.com/datasets/j4cr8kszxn
Description:
This dataset provides an integrated, municipality-level resource for analyzing the relationship between socioeconomic conditions and preterm birth (PTB) risk across Brazil. It was constructed by combining multiple large-scale public datasets, including SINASC (birth and gestational data), Cadastro Único (individual and household socioeconomic data), and IBGE population estimates. These sources were processed and merged into a unified structure representing Brazilian municipalities through a comprehensive set of socioeconomic indicators. The final dataset comprises 5,529 municipalities and 104 features describing income, education, housing conditions, sanitation, employment, and demographic characteristics. A derived metric, the Preterm Birth Municipal Rate (PMR), was computed to quantify PTB risk at the municipal level. The dataset includes both raw and processed versions, such as normalized data and PCA-reduced representations, enabling direct use in machine learning workflows and facilitating reproducibility. In addition to the dataset, this submission includes the full set of preprocessing pipelines, scripts, and configurations required to reproduce the data construction process and subsequent analyses. It also provides implementations of unsupervised learning approaches, including Self-Organizing Maps and hierarchical clustering with Dynamic Tree Cut, allowing researchers to replicate, validate, and extend the methodology. This resource is suitable for studies in artificial intelligence, public health, and socioeconomic analysis, particularly those focused on maternal and child health.
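The description mentions a derived rate (PMR), normalized data, and PCA-reduced representations, but does not give formulas or file layouts. The sketch below illustrates one plausible version of those three steps using only NumPy. Everything in it is an assumption for illustration: the PMR definition (preterm births per 1,000 live births), the function names, and the toy numbers are hypothetical and are not taken from the dataset or its scripts.

```python
import numpy as np


def preterm_rate(preterm_births, live_births, per=1000):
    """Hypothetical PMR: preterm births per `per` live births.

    Assumed definition; the dataset's actual PMR formula may differ.
    Municipalities with zero live births get NaN instead of a division error.
    """
    preterm = np.asarray(preterm_births, dtype=float)
    live = np.asarray(live_births, dtype=float)
    rate = np.full(live.shape, np.nan)
    np.divide(preterm, live, out=rate, where=live > 0)
    return rate * per


def zscore(X):
    """Standardize each column to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return (X - X.mean(axis=0)) / std


def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Made-up toy data: 4 municipalities x 3 socioeconomic indicators.
features = np.array([
    [0.42, 0.80, 12.0],
    [0.55, 0.65, 30.0],
    [0.31, 0.90, 8.0],
    [0.60, 0.50, 45.0],
])
pmr = preterm_rate([12, 30, 0, 21], [110, 240, 0, 150])
scores, explained = pca_reduce(zscore(features), n_components=2)
print(pmr)
print(scores.shape, explained)
```

Standardizing before PCA keeps indicators measured on large scales from dominating the components, which matters when income, sanitation, and demographic features are mixed in one matrix.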
Provider:
Mendeley Data
Created:
2026-04-20