five

CDAS

Source: NIAID Data Ecosystem. Indexed 2026-05-10.
Download link:
https://data.mendeley.com/datasets/vb2w9h9msp
Description:
This repository contains the analysis code for the study “Differential Factor Effects in Comorbid Depressive and Anxiety Symptoms (CDASs): A Machine Learning Approach to Individualized Mental Health Promotion.”

Data Description:
The data are derived from the Psychology and Behavior Investigation of Chinese Residents (PBICR) 2021 and include 2,951 Chinese youth aged 19–35 years. The dataset integrates multidimensional variables covering demographic, psychological, family, social, behavioral, physiological, and environmental factors (59 independent variables). The dependent variable represents four mental health categories:
• 0 = No symptoms
• 1 = Depressive symptoms only
• 2 = Anxiety symptoms only
• 3 = Comorbid depressive and anxiety symptoms (CDAS)
For the final analysis, categories 1 and 2 were excluded to construct a binary classification (0 = asymptomatic, 1 = CDAS) that focuses on identifying individuals at comorbid risk.

Analytic Framework:
All analyses were conducted in Python (version 3.8.8). The pipeline includes:
1. Data preprocessing: handling missing values, encoding categorical variables, and standardizing continuous ones.
2. Feature selection: combining LASSO and Boruta to select predictive factors.
3. Model training and evaluation: Random Forest (RF), XGBoost, LightGBM, CatBoost, Support Vector Machine (SVM), and Multilayer Perceptron (MLP) were trained under stratified cross-validation.
4. Model interpretation: SHAP (SHapley Additive exPlanations) was used to quantify feature importance and visualize group-specific factor effects.
5. Bidirectional Hierarchical Clustering Analysis (HCA): to reveal underlying risk subtypes and characteristic combination patterns.
6. Decision Curve Analysis (DCA): to evaluate clinical utility across risk thresholds.

Key Findings:
Random Forest performed best among the six models (AUC = 0.905, PR-AUC = 0.703). SHAP analysis revealed that stress and swimming were core risk factors for comorbid symptoms.
Neglecting or normalizing negative emotions, obesity-related dietary behaviors, and sports equipment usage were positively correlated with comorbid symptoms. Family health, social support, self-efficacy, and healthcare exerted protective effects. SHAP waterfall plots and a SHAP bidirectional hierarchical clustering dendrogram provided individualized predictive explanations, while decision curve analysis confirmed the model's clinical net benefit within the threshold range of 0.02–0.67.

How to Use:
All code files can be executed sequentially. Each script is annotated for reproducibility. Researchers can apply for access to the PBICR data via the official PBICR data portal.
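The binary outcome described under Data Description can be illustrated with a minimal sketch. It assumes only the four-category coding listed in the description. The function names `to_binary_label` and `build_binary_sample` are hypothetical and are not taken from the repository's scripts.

```python
from typing import List, Optional, Sequence, Tuple

# Four-category coding as listed in the dataset description.
NO_SYMPTOMS, DEPRESSION_ONLY, ANXIETY_ONLY, CDAS = 0, 1, 2, 3


def to_binary_label(category: int) -> Optional[int]:
    """Map a four-category outcome to the binary target.

    Returns 0 for asymptomatic, 1 for CDAS, and None for the two
    single-symptom categories, which are excluded from the analysis.
    """
    if category == NO_SYMPTOMS:
        return 0
    if category == CDAS:
        return 1
    if category in (DEPRESSION_ONLY, ANXIETY_ONLY):
        return None
    raise ValueError(f"unexpected category: {category}")


def build_binary_sample(categories: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Return (kept_row_indices, binary_labels) after dropping excluded rows."""
    kept: List[int] = []
    labels: List[int] = []
    for i, category in enumerate(categories):
        label = to_binary_label(category)
        if label is not None:
            kept.append(i)
            labels.append(label)
    return kept, labels
```

The returned row indices can be used to subset the feature matrix so that features and labels stay aligned after categories 1 and 2 are dropped.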
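The net benefit that decision curve analysis reports at each risk threshold can be sketched as follows. This uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), and is not the repository's own implementation. The function names and the choice to classify a case as positive when its predicted probability is at or above the threshold are assumptions made for illustration.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on predictions at a given risk threshold.

    A case is treated as positive when its predicted probability is
    greater than or equal to the threshold (an assumed convention).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equal in length")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat everyone as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A model is considered useful at a threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy). Evaluating these functions over a grid of thresholds yields the decision curve.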
Created:
2025-12-25