NIDX: A Machine Learning Approach for Identifying People with Neuroinfectious Diseases in Electronic Health Records
Source: DataCite Commons | Updated: 2025-05-31 | Indexed: 2025-06-14
Download link:
https://bdsp.io/content/nidx/1.0/
Description:
We developed a machine learning model to identify neuroinfectious diseases
(NID) from electronic health record notes. Using 3,000 notes from Mass General
Brigham, we trained an XGBoost model on text features extracted from clinical
documentation. On internal validation, the model achieved an AUROC of 0.977
and an AUPRC of 0.894, significantly outperforming both ICD code-based
identification (which showed high sensitivity of 97.1% but poor specificity
of 59.1%) and zero-shot classification using LLaMA 3.2 (AUROC 0.80). The
model maintained strong performance when tested on an external
dataset of 600 notes from Beth Israel Deaconess Medical Center (AUROC 0.976,
AUPRC 0.779). This approach provides an accurate, automated method for
identifying patients with neuroinfectious diseases from clinical notes,
enabling more precise cohort creation for research compared to traditional ICD
code-based methods.
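
To illustrate why high sensitivity paired with low specificity limits cohort precision, the following minimal sketch computes the positive predictive value (PPV) implied by the reported ICD code-based operating point, using Bayes' rule. The sensitivity and specificity come from the description above; the prevalence values are hypothetical, chosen only for illustration, and the function name is not part of the dataset or its code.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Fraction of flagged cases that are true cases, via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Reported ICD code-based operating point (from the description above).
ICD_SENSITIVITY = 0.971
ICD_SPECIFICITY = 0.591

# Hypothetical prevalences (assumption; the source does not state one).
for prevalence in (0.05, 0.10, 0.20):
    ppv = positive_predictive_value(ICD_SENSITIVITY, ICD_SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

Under an assumed prevalence of 10%, this gives a PPV of about 0.21, so roughly four of every five records flagged by ICD codes alone would be false positives. The actual figure depends on the true prevalence in the population being screened.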
Provider:
BDSP
Created:
2025-05-31



