Adverse Drug Events (ADE) Corpus
Source: OpenXLab (indexed 2026-04-18)
Download link:
https://openxlab.org.cn/datasets/OpenDataLab/Adverse_Drug_Events_ADE_Corpus
Description:
This work presents a benchmark corpus that supports the automatic extraction of drug-related adverse events from medical case reports. A large volume of drug safety information (e.g., side effects) is published in medical case reports, but because the reports are unstructured, that information can currently be explored only by human readers. The corpus is systematically annotated so that it can support both the development and the validation of automated extraction methods. Documents were double-annotated over multiple rounds to ensure consistency, and the individual annotations were then reconciled into representative consensus annotations. As a sample use case, the corpus was used to train and validate a model that classifies sentences as informative or non-informative. A maximum entropy classifier trained with simple features and evaluated by 10-fold cross-validation achieved an F1 score of 0.70, which suggests the corpus is useful in practice.
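The sentence-classification use case described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the bag-of-words features, the gradient-based training loop, the hyperparameters, the pooled F1 computation, and every function name are assumptions made for illustration. The toy sentences are invented placeholders and are not taken from the ADE corpus.

```python
import math
import random


def featurize(sentence):
    """Bag-of-words counts over lowercased whitespace tokens (assumed 'simple features')."""
    feats = {}
    for token in sentence.lower().split():
        feats[token] = feats.get(token, 0) + 1
    return feats


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(weights, bias, feats):
    """Linear score of a feature dict under the model."""
    return bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())


def train_maxent(examples, epochs=200, lr=0.1, l2=1e-3):
    """Fit a binary maximum entropy (logistic regression) model by stochastic gradient ascent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label in examples:
            err = label - sigmoid(score(weights, bias, feats))
            bias += lr * err
            for f, v in feats.items():
                w = weights.get(f, 0.0)
                weights[f] = w + lr * (err * v - l2 * w)  # L2-regularized update
    return weights, bias


def predict(model, feats):
    """Return 1 (informative) if the predicted probability is at least 0.5, else 0."""
    weights, bias = model
    return 1 if sigmoid(score(weights, bias, feats)) >= 0.5 else 0


def f1_score(gold, pred):
    """F1 for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_validate(sentences, labels, k=10, seed=0):
    """k-fold cross-validation; F1 is computed over the pooled held-out predictions."""
    data = [(featurize(s), y) for s, y in zip(sentences, labels)]
    gold, pred = [], []
    for fold in k_fold_indices(len(data), k, seed):
        held_out = set(fold)
        train = [ex for i, ex in enumerate(data) if i not in held_out]
        model = train_maxent(train)
        for i in fold:
            gold.append(data[i][1])
            pred.append(predict(model, data[i][0]))
    return f1_score(gold, pred)


# Invented placeholder sentences, NOT taken from the ADE corpus.
EVENTS = ["rash", "nausea", "headache", "dizziness", "fever"]
TOY_SENTENCES = (
    [f"patient developed {e} after starting the drug" for e in EVENTS]
    + [f"{e} resolved when the drug was withdrawn" for e in EVENTS]
    + [f"patient was admitted on day {d}" for d in range(1, 6)]
    + [f"follow up was scheduled for week {w}" for w in range(1, 6)]
)
TOY_LABELS = [1] * 10 + [0] * 10  # 1 = informative, 0 = non-informative

if __name__ == "__main__":
    print(cross_validate(TOY_SENTENCES, TOY_LABELS, k=10))
```

To apply the same procedure to the real corpus, replace TOY_SENTENCES and TOY_LABELS with sentences and labels loaded from the downloaded files. The file layout is not described in this listing, so the loading step is left out. Pooling held-out predictions before computing F1 is one common convention; averaging per-fold scores is another, and the listing does not say which was used for the reported 0.70.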
Provider:
OpenDataLab
Created:
2022-08-16



