
PubMed Medical Dataset: 3,984 Extracted Medical Sentences

Indexed by Payititi (帕依提提) on 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26336.html
Resource overview:
This dataset comprises 3,984 medical sentences extracted from PubMed abstracts, annotated with the relationships between distinct medical terms. It focuses on two categories of relationship: treatment and causal (etiologic). Of the 3,984 sentences, 1,043 contain a treatment relationship and 1,787 contain a causal relationship. Human annotators were given two distinct medical terms (e.g., "Lewy body dementia" and "formed visual hallucinations") and asked to label the relationship between them; in this example, "Lewy body dementia" causes "formed visual hallucinations". This corpus has been cited in the following papers:
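As a rough illustration of the annotation scheme described above, one annotated sentence can be modeled as a term pair plus a relation label. This is only a sketch: the field names (`sentence`, `term1`, `term2`, `relation`) and the label strings `"causes"` and `"treats"` are assumptions, not the dataset's actual schema, and the sentence text is a placeholder built from the example term pair.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedSentence:
    """One annotated sentence (hypothetical schema, not the dataset's real columns)."""
    sentence: str
    term1: str
    term2: str
    relation: str  # assumed label names, e.g. "causes" or "treats"


def count_relations(records):
    """Count how many records carry each relation label."""
    return Counter(r.relation for r in records)


# Placeholder record; only the term pair and its relation come from the description.
examples = [
    AnnotatedSentence(
        sentence="Lewy body dementia causes formed visual hallucinations.",
        term1="Lewy body dementia",
        term2="formed visual hallucinations",
        relation="causes",
    ),
]

counts = count_relations(examples)
```

Applied to the full corpus, a tally like this would be one way to check the stated counts of treatment and causal sentences, once the real column names and label values are known.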
Provider:
Payititi (帕依提提)