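Data Preprocessing, Baseline Characteristics, Variable Correlation Analysis, and Model Evaluation for Predicting Ventilator-Associated Pneumonia in ICU Patients with Type 2 Diabetes Mellitus, TRIPODAI Guide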
Source: Figshare | Updated: 2025-10-29 | Indexed: 2026-04-08
Download link:
https://figshare.com/articles/dataset/Data_Preprocessing_Baseline_Characteristics_Variable_Correlation_Analysis_and_Model_Evaluation_for_Predicting_Ventilator-Associated_Pneumonia_in_ICU_Patients_with_Type_2_Diabetes_Mellitus/30454706/2
Description:
This dataset contains the baseline characteristics and supplementary data for a study of ICU patients with type 2 diabetes mellitus that uses machine learning to predict ventilator-associated pneumonia (VAP). The baseline characteristics table summarizes patient demographics, vital signs, and laboratory values. The supplementary figures illustrate the data preprocessing steps (histograms and boxplots before and after interquartile range cleaning), missing-value imputation using Random Forest, variable correlation analysis (Spearman heatmap), and model evaluation (confusion matrices for four predictive models). Together, these data provide a detailed overview of feature selection, cleaning procedures, and model performance assessment.

Fig. S1. Histograms and boxplots of Glucose_max and SBP_max in the original and cleaned datasets. Glucose_max, maximum blood glucose; SBP_max, maximum systolic blood pressure. (A) Original Glucose_max; (B) cleaned Glucose_max; (C) original SBP_max; (D) cleaned SBP_max.

Fig. S2. Histograms and boxplots of Temp_min and WBC_min in the original and cleaned datasets. Temp_min, minimum body temperature; WBC_min, minimum white blood cell count. (A) Original Temp_min; (B) cleaned Temp_min; (C) original WBC_min; (D) cleaned WBC_min.

Fig. S3. Histograms of PH_max and PH_min in the original and Random Forest-imputed datasets. PH_max, maximum pH; PH_min, minimum pH.

Fig. S4. Histograms of PO2_max and PO2_min in the original and Random Forest-imputed datasets. PO2_max, maximum partial pressure of oxygen; PO2_min, minimum partial pressure of oxygen.

Fig. S5. Histograms of PT_max and PT_min in the original and Random Forest-imputed datasets. PT_max, maximum prothrombin time; PT_min, minimum prothrombin time.

Fig. S6. Spearman correlation heatmap of the variables selected by both the Boruta algorithm and LASSO regression. Hypertension, history of hypertension; Temp_min, minimum body temperature; Glucose_max, maximum blood glucose; Scr_max, maximum serum creatinine; WBC_min, minimum white blood cell count; CNS, SOFA neurological subscore; Renal, SOFA renal subscore; GCS, Glasgow Coma Scale.

Fig. S7. Confusion matrices of the four predictive models: (A) Logistic Regression, (B) Random Forest, (C) XGBoost, and (D) Gradient Boosting Machine (GBM). Each matrix presents the counts of true positives, true negatives, false positives, and false negatives, facilitating comparison of model performance.
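For readers unfamiliar with the two simplest steps named above, the following is a minimal Python sketch of interquartile range cleaning (Figs. S1 and S2) and of the four counts reported in a confusion matrix (Fig. S7). It is not the study's code. The function names `iqr_clean` and `confusion_counts`, the fence multiplier of 1.5, the choice to drop out-of-range values rather than flag them, and all sample values are assumptions made for illustration; the description does not state which settings the authors used.

```python
from statistics import quantiles


def iqr_clean(values, k=1.5):
    """Keep values inside the fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional Tukey multiplier, assumed here.
    Missing entries (None) are skipped before the quartiles are computed.
    """
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in observed if low <= v <= high]


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = VAP, 0 = no VAP)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["TP"] += 1
        elif truth == 0 and pred == 0:
            counts["TN"] += 1
        elif truth == 0 and pred == 1:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


# Hypothetical Glucose_max readings; 950 lies far outside the upper fence.
glucose_max = [110, 126, 140, 155, 162, 180, 195, 210, 950]
cleaned = iqr_clean(glucose_max)

# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
```

Comparing a histogram of `glucose_max` with one of `cleaned` gives the kind of before-and-after view shown in the supplementary figures, and computing `counts` once per model yields the entries of one confusion matrix panel.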
Provided by:
Aoxing, Shi
Created:
2025-10-29



