
Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database

Source: DataCite Commons. Updated 2024-01-25; indexed 2024-07-13.
Download link:
https://physionet.org/content/annotation-dataset-sdoh/1.0.0/
Description:
Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected in the electronic health record (EHR). This study investigated the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented, and explored the role of synthetic clinical text in improving the extraction of these scarcely documented, yet extremely valuable, clinical data. We developed annotation guidelines for sentence-level annotation of SDoH that are not reliably available as structured data in the EHR: employment, housing, transportation, parental status, relationship, and social support. Sentences were labeled for both the presence of an SDoH mention and the presence of an adverse SDoH mention. After finalizing the annotation guidelines, two annotators manually annotated a separate corpus, which cannot be released because it contains protected health information (PHI). A total of 300 of the 800 notes in this corpus (37.5%) underwent dual annotation. Before adjudication, dually annotated notes had a Krippendorff's alpha of 0.86 and a Cohen's kappa of 0.86 for any SDoH mention categories. For adverse SDoH mentions, notes had a Krippendorff's alpha of 0.76 and a Cohen's kappa of 0.76. As an external validation, 200 notes from MIMIC-III written by physicians, social workers, and nurses were manually annotated by a single annotator. Here, we release this manually annotated corpus of 200 MIMIC-III notes.
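For readers unfamiliar with the agreement statistics quoted above, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators who label the same sentences. It is not the authors' code: the function name `cohen_kappa` and the toy label lists are hypothetical, and the description does not state whether agreement was computed per category or pooled across categories.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels (1 = sentence mentions an SDoH, 0 = no mention).
# These values are invented for illustration and are not from the dataset.
annotator_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because most sentences in a clinical note contain no SDoH mention and two annotators would agree on many of them by default.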
Provider:
PhysioNet
Created:
2023-11-18