Automatic recognition of self-acknowledged limitations in clinical research literature
Source: DataONE | Updated: 2020-06-24 | Indexed: 2025-05-03
Download link:
https://search.dataone.org/view/sha256:09aa316284227753969edf0d3c60fef0510b8e1672de9bd39cfa72a6bcd1effe
Description:
Objective: To automatically recognize self-acknowledged limitations in clinical research publications to support efforts in improving research transparency.
Materials and Methods: To develop our recognition methods, we used a set of 8,431 sentences from 1,197 PubMed Central articles. A subset of these sentences was manually annotated for training/testing and inter-annotator agreement was calculated. We cast the recognition problem as a binary classification task, in which we determine whether a given sentence from a publication discusses self-acknowledged limitations or not. We experimented with three methods: a rule-based approach based on document structure, supervised machine learning, and a semi-supervised method that uses self-training to expand the training set in order to improve classification performance. The machine learning algorithms used were logistic regression (LR) and support vector machines (SVM).
Results: Annotators had good agreement in labeling limitation sentences...
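The semi-supervised method mentioned under Materials and Methods, self-training on top of a binary sentence classifier, can be illustrated roughly as follows. This is only a minimal sketch and not the authors' implementation: the bag-of-words features, the 0.9 confidence threshold, the round limit, the learning rate, and the example sentences are all assumptions made for illustration, and a small hand-written logistic regression stands in for the LR and SVM models used in the study.

```python
import math
from collections import Counter


def featurize(sentence):
    """Bag-of-words counts over lowercased whitespace tokens (assumed features)."""
    return Counter(sentence.lower().split())


def predict_proba(weights, bias, feats):
    """Probability that a sentence is a limitation sentence (label 1)."""
    z = bias + sum(weights.get(tok, 0.0) * cnt for tok, cnt in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(examples, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label in examples:
            err = label - predict_proba(weights, bias, feats)
            bias += lr * err
            for tok, cnt in feats.items():
                weights[tok] = weights.get(tok, 0.0) + lr * err * cnt
    return weights, bias


def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Self-training: grow the training set with confidently predicted sentences.

    labeled is a list of (sentence, label) pairs with label in {0, 1};
    unlabeled is a list of sentences. Returns (weights, bias, training_set_size).
    """
    train = [(featurize(s), y) for s, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        weights, bias = train_lr(train)
        confident, remaining = [], []
        for s in pool:
            feats = featurize(s)
            p = predict_proba(weights, bias, feats)
            if p >= threshold:
                confident.append((feats, 1))   # pseudo-label: limitation
            elif p <= 1.0 - threshold:
                confident.append((feats, 0))   # pseudo-label: not a limitation
            else:
                remaining.append(s)            # too uncertain, keep in the pool
        if not confident:
            break
        train.extend(confident)
        pool = remaining
    weights, bias = train_lr(train)  # final fit on the expanded training set
    return weights, bias, len(train)


# Hypothetical toy data, not taken from the dataset.
labeled = [
    ("a limitation of this study is the small sample size", 1),
    ("our findings are limited by the retrospective design", 1),
    ("patients were randomized to two treatment groups", 0),
    ("blood pressure was measured at baseline", 0),
]
unlabeled = [
    "another limitation is the short follow-up period",
    "participants were recruited from three hospitals",
]

weights, bias, train_size = self_train(labeled, unlabeled)
print(train_size, predict_proba(weights, bias, featurize(unlabeled[0])))
```

A stricter threshold adds fewer pseudo-labeled sentences per round but lowers the risk of reinforcing early mistakes, which is the usual trade-off to tune in self-training.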
Created:
2025-04-19



