Machine learning identifies girls with central precocious puberty based on multi-source data
Source: DataCite Commons | Updated: 2025-04-01 | Indexed: 2025-04-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.bk3j9kd99
Resource description:
Objective: The study aimed to develop simplified diagnostic models for
identifying girls with central precocious puberty (CPP), without the
expensive and cumbersome gonadotropin-releasing hormone (GnRH) stimulation
test, which is the gold standard for CPP diagnosis. Materials and Methods:
Female patients who developed secondary sexual characteristics before 8
years of age and underwent a GnRH analog (GnRHa) stimulation test at a medical
center in Guangzhou, China were enrolled. Data from clinical visits,
laboratory tests, and medical imaging examinations were collected. We first
extracted features from unstructured data such as clinical reports and
medical images. Then, models based on each single data source or on
multi-source data were developed with an Extreme Gradient Boosting (XGBoost)
classifier to classify patients as CPP or non-CPP. Results: The model based
on multi-source data performed best, with an AUC of 0.88 and a Youden index
of 0.64. The performance of single-source models based
on data from basal laboratory tests and the feature importance of each
variable showed that the basal hormone test had the highest diagnostic
value for a CPP diagnosis. Conclusion: We developed three simplified
models that use easily accessed clinical data before the GnRH stimulation
test to identify girls who are at high risk of CPP. These models are
tailored to the needs of patients in different clinical settings. Machine
learning technologies and multi-source data fusion can help to make a
better diagnosis than traditional methods.
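
The two metrics reported in the Results (AUC and Youden index) can be illustrated with a minimal sketch. It assumes binary labels (1 = CPP, 0 = non-CPP) and a predicted score per patient. The function names and the toy labels and scores are hypothetical, chosen only for illustration; they are not taken from the dataset or from the authors' code. The Youden index is J = sensitivity + specificity - 1, which equals TPR - FPR, maximized over all score thresholds.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr, threshold) triples, one per distinct score threshold."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Start at (0, 0): a threshold above every score predicts no positives.
    points = [(0.0, 0.0, float("inf"))]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos, thr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def youden_index(points):
    """Return (J, threshold) where J = TPR - FPR is largest."""
    fpr, tpr, thr = max(points, key=lambda p: p[1] - p[0])
    return tpr - fpr, thr


# Hypothetical toy data, NOT from the study: 1 = CPP, 0 = non-CPP.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

toy_points = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_points)
toy_j, toy_threshold = youden_index(toy_points)
print(f"AUC={toy_auc:.2f}, Youden index={toy_j:.2f} at threshold {toy_threshold}")
```

In practice the scores would come from a fitted classifier's predicted probabilities on held-out data. This sketch covers only the evaluation step, not the XGBoost training described in the abstract.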
Provider:
Dryad
Created:
2021-01-05



