C-REACT: Contextualized Race and Ethnicity Annotations for Clinical Text
Source: physionet.org (indexed 2025-01-16)
Download link:
https://physionet.org/content/race-ethnicity-clinical-text/1.0.0/
Resource description:
The Contextualized Race and Ethnicity Annotations for Clinical Text (C-REACT) dataset is a large, publicly available corpus of sentences from clinical notes, manually annotated for information related to race and ethnicity (RE). The corpus contains 17,281 sentences drawn from the clinical notes of 12,000 patients at the Beth Israel Deaconess Medical Center critical care units between 2001 and 2012. It provides two sets of reference-standard annotations for RE data. The first set contains granular RE information, such as the patient's country of origin and spoken language. The second set contains RE labels manually assigned by clinicians. The corpus is intended to improve understanding of the granular RE-related information contained in clinical notes and of how this information might be used to infer RE.
Provider:
physionet.org



