DFKI-SLT/OptimAL
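OptimALBaselineDataset Dataset Overview
Dataset Description
Dataset Summary
The OptimALBaselineDataset supports drug discovery and clinical decision support. It was produced by combining weak supervision (programmatic labeling and crowdsourcing) with deep learning methods to extract drug-disease relations from DailyMed text, yielding high-quality drug-disease relation data. The generated data overlaps substantially with DrugCentral, a manually curated dataset. Using this dataset, a machine learning model was built to classify relations between drugs and diseases in text into four categories: treatment, symptomatic relief, contraindication, and effect.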
Languages
The language of the dataset is English.
Dataset Structure
Data Instances
An example of a train data instance looks as follows:

```json
{
  "_unit_id": 2270472226,
  "Worker Answer": "effect",
  "context": "(See INDICATIONS AND USAGE and WARNINGS.) Experience in over 1,400 patients with nifedipine immediate-release capsules in a noncomparative clinical trial has shown that concomitant administration of nifedipine and beta-blocking agents is usually well tolerated, but there have been occasional literature reports suggesting that the combination may increase the likelihood of congestive heart failure, severe hypotension, or exacerbation of angina.",
  "drug_name": "Nifedipine",
  "disease_name": "CONGESTIVE HEART FAILURE"
}
```
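In the instance above, the drug and disease names differ in case from their mentions in the context ("Nifedipine" vs. "nifedipine", "CONGESTIVE HEART FAILURE" vs. "congestive heart failure"). A minimal sketch of locating such mentions with a case-insensitive search follows. The helper `find_mention` is hypothetical and not part of the dataset or its tooling, and the context string is a verbatim excerpt of the example, shortened for readability.

```python
import re
from typing import Optional, Tuple


def find_mention(context: str, name: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first case-insensitive match of `name`
    in `context`, or None if the name does not occur.

    Hypothetical helper for illustration only.
    """
    match = re.search(re.escape(name), context, flags=re.IGNORECASE)
    return match.span() if match else None


# Excerpt of the example instance (context shortened, otherwise verbatim).
record = {
    "context": (
        "concomitant administration of nifedipine and beta-blocking agents "
        "is usually well tolerated, but there have been occasional literature "
        "reports suggesting that the combination may increase the likelihood "
        "of congestive heart failure, severe hypotension, or exacerbation of angina."
    ),
    "drug_name": "Nifedipine",
    "disease_name": "CONGESTIVE HEART FAILURE",
}

drug_span = find_mention(record["context"], record["drug_name"])
disease_span = find_mention(record["context"], record["disease_name"])

if drug_span is not None and disease_span is not None:
    # Slice the context to show the surface forms as they appear in the text.
    drug_text = record["context"][drug_span[0]:drug_span[1]]
    disease_text = record["context"][disease_span[0]:disease_span[1]]
    print(drug_text, "|", disease_text)
```

A plain substring search like this will miss abbreviations, synonyms, and inflected forms, so it is only a first check that an entity pair is grounded in its context.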
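Data Fields
- _unit_id: a unique identifier for the data entry, of type int64.
- Worker Answer: the answer or classification provided by the crowd worker based on the context, of type string.
- context: the text describing the scenario in which the drug and the disease interact, of type string.
- drug_name: the name of the drug discussed in the context, of type string.
- disease_name: the name of the disease related to the drug in the context, of type string.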
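The field list can be turned into a simple record check. The sketch below assumes records arrive as plain Python dictionaries (for example, parsed from JSON); `FIELD_TYPES` and `validate_record` are hypothetical names introduced here, not part of the dataset's own tooling. The int64 type is approximated with a Python `int` plus a range check.

```python
# Expected Python type for each field, following the field descriptions.
FIELD_TYPES = {
    "_unit_id": int,        # int64 in the dataset
    "Worker Answer": str,
    "context": str,
    "drug_name": str,
    "disease_name": str,
}

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def validate_record(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid.

    Hypothetical helper for illustration only.
    """
    problems = []
    for field, expected in FIELD_TYPES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
        elif expected is int and not (INT64_MIN <= value <= INT64_MAX):
            problems.append(f"{field}: value does not fit in int64")
    return problems


good = {
    "_unit_id": 2270472226,
    "Worker Answer": "effect",
    "context": "(See INDICATIONS AND USAGE and WARNINGS.)",  # shortened
    "drug_name": "Nifedipine",
    "disease_name": "CONGESTIVE HEART FAILURE",
}
bad = dict(good, _unit_id="2270472226")  # identifier given as a string
del bad["drug_name"]                     # and one field missing

good_problems = validate_record(good)
bad_problems = validate_record(bad)
print(good_problems)
print(bad_problems)
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a record at once when scanning a whole split.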
Citation
BibTeX
```bibtex
@article{SHINGJERGJI2021103902,
  title = {Relation extraction from DailyMed structured product labels by optimally combining crowd, experts and machines},
  journal = {Journal of Biomedical Informatics},
  volume = {122},
  pages = {103902},
  year = {2021},
  issn = {1532-0464},
  doi = {https://doi.org/10.1016/j.jbi.2021.103902},
  url = {https://www.sciencedirect.com/science/article/pii/S1532046421002318},
  author = {Krist Shingjergji and Remzi Celebi and Jan Scholtes and Michel Dumontier},
  keywords = {Drug-disease relation classification, Drug indications, Drug data quality, Drug repositioning, Weak supervision, Programmatic labeling, Crowdsourcing, Human-in-the-loop, Machine learning},
}
```
APA
- Shingjergji, K., Celebi, R., Scholtes, J., & Dumontier, M. (2021). Relation extraction from DailyMed structured product labels by optimally combining crowd, experts and machines. Journal of Biomedical Informatics, 122, 103902. https://doi.org/10.1016/j.jbi.2021.103902



