Automated extraction of post-stroke functional outcomes from unstructured electronic health records
Source: DataCite Commons. Updated: 2025-10-02. Indexed: 2026-02-08.
Download link:
https://bdsp.io/content/j004ky2i4eijzhs36ubw/1.0.0/
Description:
### Purpose:
Population-level tracking of post-stroke functional outcomes is critical to
guide interventions that reduce the burden of stroke-related disability.
However, functional outcomes are often missing or documented in unstructured
notes. We developed a natural language processing (NLP) model that reads
electronic health record (EHR) notes to automatically determine the modified
Rankin Scale (mRS) score.
### Method:
We included consecutive patients (⩾18 years) with acute stroke admitted to our
center (2015-2024). mRS scores were obtained from the Get With the Guidelines
registry and clinical notes (if documented), and used as the gold standard to
compare against NLP-generated scores. We used text-based features from notes,
along with age, sex, discharge status, and outpatient follow-up, to train a
logistic regression model to classify good (0-2) versus poor (3-6) mRS, and a
linear regression model to predict the full range of mRS scores. The models
were trained to predict mRS at hospital discharge and post-discharge. The
models were externally validated in a dataset of patients with brain injuries
from a different healthcare center.
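
As a rough illustration of the setup described above, the sketch below shows how an mRS score could be dichotomized into good (0-2) versus poor (3-6) and how note text might be combined with structured covariates into one feature record. The function names, the bag-of-words tokenization, and the binary encoding of discharge status and follow-up are assumptions made for this example; the abstract does not specify how the text features were built.

```python
import re
from collections import Counter


def dichotomize_mrs(score: int) -> int:
    """Return 0 for a good outcome (mRS 0-2) and 1 for a poor outcome (mRS 3-6)."""
    if not 0 <= score <= 6:
        raise ValueError(f"mRS must be between 0 and 6, got {score}")
    return 1 if score >= 3 else 0


def build_features(note_text, age, sex, discharged_home, had_followup):
    """Combine token counts from a note with structured covariates.

    Hypothetical featurization: lowercase word counts stand in for the
    unspecified text-based features, and discharge status and follow-up
    are reduced to yes/no flags.
    """
    tokens = re.findall(r"[a-z]+", note_text.lower())
    features = {f"tok={tok}": float(n) for tok, n in Counter(tokens).items()}
    features["age"] = float(age)
    features["sex_female"] = 1.0 if sex == "F" else 0.0
    features["discharged_home"] = 1.0 if discharged_home else 0.0
    features["had_followup"] = 1.0 if had_followup else 0.0
    return features
```

A record built this way could be passed to any standard logistic or linear regression implementation, with `dichotomize_mrs` supplying the binary target and the raw score supplying the continuous one.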
### Findings:
We included 5307 patients: 5006 in the training and test sets and 301 in the
external validation set; mean age was 69 (SD 15) and 65 (SD 17) years,
respectively; 47% were female. The logistic regression achieved an area under
the receiver operating characteristic curve (AUROC) of 0.94 [CI 0.93-0.95]
(test) and 0.94 [0.91-0.96] (validation), and the linear model a root mean
squared error (RMSE) of 0.91 [0.87-0.94] (test) and 1.17 [1.06-1.28]
(validation).
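
For readers less familiar with the two metrics reported above, the following minimal sketch computes them from scratch on toy inputs. It assumes binary labels coded 1 for poor outcome and 0 for good outcome, and uses the pairwise (Mann-Whitney) definition of AUROC; it is not the authors' evaluation code and does not reproduce their confidence intervals.

```python
import math


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted mRS scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values for illustration only; these are not data from the study.
toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_rmse = rmse([0, 2, 4], [1, 2, 2])
```

The pairwise loop is quadratic in the number of patients, which is fine for a small illustration; a rank-based implementation would be preferable at the scale of the cohort described here.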
### Discussion and Conclusion:
The NLP-based model is suitable for use in large-scale phenotyping of stroke
functional outcomes and population health research.
Provider:
BDSP
Created:
2025-09-26



