Supplementary Material for: Early Prediction of Adverse Stroke Outcomes using Non-clinical Factors and Missing Data: A Machine Learning Study
Source: Figshare. Updated: 2026-02-07. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Supplementary_Material_for_Early_Prediction_of_Adverse_Stroke_Outcomes_using_Non-clinical_Factors_and_Missing_Data_A_Machine_Learning_Study/31287007
Description:
Introduction: Early prediction of stroke outcomes using prognostic tools may help clinical decision making and inform resource allocation. However, the clinical information required to inform prediction tools is often missing. We evaluated the performance of machine learning (ML) prediction models of adverse stroke outcome at 90 days post-admission that exploit non-clinical data and missingness alongside traditional clinical and demographic predictors.

Methods: We used routine hospital data from UK clinical sites (NHS SafeHaven) to train three Gradient Boosted (GBM) models. We compared baseline clinical features with non-clinical features and missingness to predict a composite 90-day adverse stroke outcome: mortality, stroke recurrence or new care-home discharge. Model validation used 10% of the data. Model performance was evaluated by accuracy (correct predictions/total predictions) and the area under the receiver operating characteristic (ROC) curve (AUC), and DeLong’s test was used to compare the performance of the three models. We used the Brier score to evaluate model calibration. SHapley Additive exPlanations (SHAP) analyses determined the contribution of each model feature in predicting adverse stroke outcome.

Results: The final sample included 3530 stroke patients, 51% male (mean age=72 years; SD=14). Clinical data were incomplete, with five clinical features having >63% missing values. The performance of the three models was not significantly different (p=0.5 to 0.9). The model with non-clinical and missingness features demonstrated 71% accuracy, an AUC of 0.76 and a Brier score of 0.19. Non-clinical factors, such as time to clinical assessment and time to admission, were amongst the five most important predictors of adverse stroke outcome (mean |SHAP|=0.03 and 0.05), alongside Glasgow Coma Scale (0.08), age (0.03) and temperature (0.02). Missing clinical values (pulse and LDL) predicted adverse stroke outcome (mean |SHAP|=0.02 and 0.02) and were correlated with age (ρ=0.2), arrival by ambulance (ρ=0.3), length of stay (ρ=-0.3) and Transient Ischemic Attack (ρ=0.3).

Conclusion: We demonstrate that non-clinical factors and missingness of data can assist in early predictions of 90-day adverse stroke outcomes. As these factors are often well documented in electronic health systems, they could complement or supplement traditional clinical predictive factors.
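
The idea of treating missingness as a predictor, and the three evaluation metrics named in the Methods (accuracy, AUC and Brier score), can be illustrated with a minimal Python sketch. This is not the study's code: the function names, the feature keys `pulse` and `ldl`, the 0.5 decision threshold and the toy data are all assumptions made for illustration, and the AUC is computed with the simple pairwise ranking definition rather than any particular library routine.

```python
def add_missingness_indicators(rows, features):
    """Return copies of rows with a 0/1 '<feature>_missing' flag per feature.

    A value of None is treated as missing (an assumption of this sketch).
    """
    flagged = []
    for row in rows:
        new_row = dict(row)
        for name in features:
            new_row[f"{name}_missing"] = 1 if row.get(name) is None else 0
        flagged.append(new_row)
    return flagged


def accuracy(y_true, y_prob, threshold=0.5):
    """Correct predictions / total predictions at a probability threshold."""
    correct = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a positive case is scored above a negative one.

    Ties count as half a win. Requires at least one case of each class.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical records: None marks a clinical value that was not recorded.
    records = [{"pulse": 80, "ldl": None}, {"pulse": None, "ldl": 2.1}]
    print(add_missingness_indicators(records, ["pulse", "ldl"]))

    # Toy outcomes (1 = adverse outcome) and predicted probabilities.
    outcomes = [1, 0, 1, 0]
    predicted = [0.9, 0.2, 0.4, 0.6]
    print(accuracy(outcomes, predicted))
    print(brier_score(outcomes, predicted))
    print(roc_auc(outcomes, predicted))
```

A lower Brier score indicates better-calibrated probabilities, while accuracy depends on the chosen threshold and AUC does not. Comparing AUCs between models with DeLong's test, as the study does, needs a dedicated statistical routine and is not shown here.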
Created:
2026-02-07



