RadCoref: Fine-tuning coreference resolution for different styles of clinical narratives
DataCite Commons. Updated 2024-01-30. Indexed 2024-07-13.
Download link:
https://physionet.org/content/rad-coreference-resolution/1.0.0/
Description:
RadCoref is a small subset of MIMIC-CXR with manually annotated coreference
mentions and clusters. It was annotated by a panel of three cross-disciplinary
experts with experience in clinical data processing, following the i2b2
annotation scheme with minimal modification. The dataset consists of Findings
and Impression sections extracted from full radiology reports, with 950, 25,
and 200 section documents for training, validation, and testing, respectively.
The training and validation sets were annotated by a single annotator. The test
set was annotated independently by two annotators, and their results were
merged manually by the third. The dataset is intended to support coreference
resolution on radiology reports. Because MIMIC-CXR is already de-identified,
no protected health information (PHI) is included.
Provider:
PhysioNet
Created:
2024-01-26



