Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database
Source: DataCite Commons. Updated: 2024-01-25. Indexed: 2024-07-13.
Download link:
https://physionet.org/content/annotation-dataset-sdoh/1.0.0/
Description:
Social determinants of health (SDoH) have an important impact on patient
outcomes but are incompletely captured in electronic health records
(EHRs). This study investigated the ability of large language models to extract
SDoH from free text in EHRs, where they are most commonly documented, and
explored the role of synthetic clinical text for improving the extraction of
these scarcely documented, yet extremely valuable, clinical data. We developed
annotation guidelines for sentence-level annotation of SDoH that are not
reliably available as structured data in the EHR: employment, housing,
transportation, parental status, relationship, and social support. Sentences
were labeled for both the presence of an SDoH mention and the presence of an
adverse SDoH mention. After finalizing the annotation guidelines, two
annotators manually annotated a separate corpus of 800 notes, which cannot be
released because it contains protected health information (PHI). Of these
notes, 300 (37.5%) underwent dual annotation.
Before adjudication, dually annotated notes had a Krippendorff's alpha
of 0.86 and a Cohen's kappa of 0.86 for any SDoH mention category.
For adverse SDoH mentions, notes had a Krippendorff's alpha of 0.76
and a Cohen's kappa of 0.76. As an external validation, 200 notes from MIMIC-III
written by physicians, social workers, and nurses were manually annotated by a
single annotator. Here, we release this manually annotated corpus of 200
MIMIC-III notes.
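
For readers unfamiliar with the agreement statistics quoted above, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators who label the same sentences. It is an illustration only, not the authors' code: the function name `cohens_kappa` and the toy labels are assumptions, and the source does not describe how the reported values were actually calculated.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Hypothetical helper for illustration; not taken from the dataset's code.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's label frequencies,
    # summed over all labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example (made-up labels): 1 = sentence mentions an SDoH, 0 = it does not.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple percentage of matching labels.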
Provider:
PhysioNet
Created:
2023-11-18



