
Annotated Corpus of PubMed Abstracts

Source: arXiv. Updated: 2019-12-04. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/1912.01831v1
Description:
This annotated corpus contains 750 PubMed abstracts and is intended for studying the effects reported for treatments and substances. It was created by institutions including the School of Mathematics and Information Sciences of Sofia University. Each abstract is annotated according to whether it describes the effect of a treatment or substance as positive, negative, or neutral. The corpus was built through a combination of automatic processing and manual annotation, with particular attention to recognizing medical terms and abbreviations. Its main use is training text classifiers that identify the effects discussed in PubMed abstracts; the reported classifier accuracy is 78.80%. The corpus may also be useful in related research on understanding and processing medical text.
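The three-way labeling (positive, negative, neutral) maps naturally onto a standard text classification setup. The sketch below is only an illustration of that setup, not the classifier behind the reported accuracy: it assumes a small multinomial naive Bayes model written in plain Python, and every training sentence in `TOY_DATA` is invented for the example rather than taken from the corpus. The names `NaiveBayes`, `tokenize`, `accuracy`, and `TOY_DATA` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep alphabetic runs only. A real pipeline would also
    # need to handle medical terms and abbreviations, which this does not.
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(predicted, gold):
    # Fraction of items whose predicted label matches the gold label.
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Invented toy sentences (NOT from the corpus), two per label.
TOY_DATA = [
    ("The drug significantly improved survival and reduced symptoms", "positive"),
    ("Treatment improved patient outcomes", "positive"),
    ("The compound increased mortality and caused severe toxicity", "negative"),
    ("Treatment caused adverse events and worsened symptoms", "negative"),
    ("No significant difference was observed between groups", "neutral"),
    ("The substance had no effect on blood pressure", "neutral"),
]

model = NaiveBayes().fit([t for t, _ in TOY_DATA], [l for _, l in TOY_DATA])

held_out = [
    ("The therapy improved outcomes", "positive"),
    ("The agent caused severe toxicity", "negative"),
    ("No difference was observed", "neutral"),
]
predictions = [model.predict(text) for text, _ in held_out]
score = accuracy(predictions, [label for _, label in held_out])
```

On real abstracts, accuracy would be measured the same way, as the fraction of held-out abstracts whose predicted label matches the annotated one. The toy score here says nothing about performance on the actual corpus.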
Provided by:
School of Mathematics and Information Sciences, Sofia University
Created:
2019-12-04