Data for: Representing Oncology in Datasets: Standard or Custom Biomedical Terminology?
Source: doi.org. Indexed: 2025-01-15
Download link:
http://doi.org/10.17632/xzbzz6czr7.1
Description:
We collected 250 cancer-related records that had already been coded with a custom terminology at Roche Inc (ROCHE). The purpose was to annotate substances of pharmacological interest, taking both anatomy (Roche.Anatomie) and histology (Roche.Histology) into account. Each of the two medical students (CoderA and CoderB) was given 150 records.
Coders were asked to identify, with as few codes as possible, representational units that expressed the same meaning in four target terminologies (SNOMED CT, NCIt, ICD-10 + ICD-O, and MedDRA). Fifty of the records were assigned to both coders (double-coded) to enable the computation of inter-rater agreement.
For evaluation, we used the following definitions. Hit: at least one code was provided for that terminology. Agreement: both coders provided exactly the same codes. Consequently, an absence of codes from both coders counted as an agreement, and the same primary code with a different secondary code counted as a disagreement.
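The Hit and Agreement definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation script: the function names are hypothetical, the code strings are placeholders rather than real terminology codes, and comparing codes as unordered sets is an assumption about what "exactly the same codes" means.

```python
from typing import Iterable, List, Tuple


def is_hit(codes: Iterable[str]) -> bool:
    """Hit: at least one code was provided for the terminology."""
    return len(set(codes)) > 0


def is_agreement(codes_a: Iterable[str], codes_b: Iterable[str]) -> bool:
    """Agreement: both coders provided exactly the same codes.

    Assumption: codes are compared as unordered sets, so two empty
    code lists agree, and any differing code makes a disagreement.
    """
    return set(codes_a) == set(codes_b)


def agreement_rate(pairs: Iterable[Tuple[List[str], List[str]]]) -> float:
    """Share of double-coded records on which both coders agree."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no double-coded records supplied")
    return sum(is_agreement(a, b) for a, b in pairs) / len(pairs)


# Placeholder codes for one terminology (hypothetical, not real codes).
double_coded = [
    ([], []),                          # neither coder gave a code: agreement
    (["PRIM_1", "SEC_1"], ["PRIM_1", "SEC_2"]),  # same primary, different secondary: disagreement
    (["PRIM_1"], ["PRIM_1"]),          # identical codes: agreement
    (["PRIM_1"], []),                  # only CoderA gave a code: disagreement
]
rate = agreement_rate(double_coded)
```

Under these assumptions the rate would be computed separately for each of the four target terminologies, over the fifty double-coded records.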
Provider:
Mendeley Data



