COVID-19 Annotated Clinical Text (CACT) Corpus
Source: arXiv. Updated: 2021-03-11. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2012.00974v2
Description:
The CACT corpus, created by the University of Washington, contains 1,472 clinical notes with detailed annotations covering COVID-19 diagnoses, testing, and symptom descriptions. It spans multiple clinical note types and is intended to support automatic information extraction models, so that information encoded in free text can be used in large-scale studies. The corpus was built using detailed annotation guidelines and multiple rounds of annotation to keep the labels accurate and consistent. Applications include, but are not limited to, predicting COVID-19 test results and exploring the clinical presentation of COVID-19, with the aim of addressing questions related to COVID-19 diagnosis and treatment.
Provider:
University of Washington
Created:
2020-12-02



