
The use of text-matching software’s similarity scores

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://figshare.com/articles/dataset/The_use_of_text-matching_software_s_similarity_scores/16823827
Description:
Popular text-matching software generates a percentage of similarity, called a “similarity score” or “Similarity Index”, that quantifies the matching text between a particular manuscript and content in the software’s archives, on the Internet and in electronic databases. Many evaluators rely on these simple figures as a proxy for plagiarism and thus avoid the burdensome task of inspecting the longer, detailed Similarity Reports. Yet similarity scores, though alluringly straightforward, are never enough to judge the presence (or absence) of plagiarism. Ideally, evaluators should always examine the Similarity Reports. However, some academic journals and educational institutions persist in using simplistic similarity score thresholds, and relying on the scores saves time. A method is therefore arguably needed that encourages examining the Similarity Reports but still allows evaluators to rely on the scores in some instances. This article proposes a four-band method to accomplish this. Used together, the bands oblige evaluators to acknowledge the risk of relying on the similarity scores, yet still allow them to decide whether to accept that risk. The four bands (most rigor, high rigor, moderate rigor and less rigor) should be tailored to an evaluator’s particular needs.
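The description above does not state the bands' cut-off values or their exact decision rules, so the following is only an illustrative sketch of how a banded policy might be encoded. It assumes that each band maps to a score threshold at or above which the evaluator must open the Similarity Report, and that the "most rigor" band always requires it. The names `Rigor`, `REVIEW_THRESHOLD` and `must_examine_report`, and every numeric cut-off, are hypothetical and not taken from the article.

```python
from enum import Enum


class Rigor(Enum):
    """The four bands named in the description."""
    MOST = "most rigor"
    HIGH = "high rigor"
    MODERATE = "moderate rigor"
    LESS = "less rigor"


# Hypothetical cut-offs in percent; NOT taken from the article.
# An evaluator would tailor these to their own needs.
REVIEW_THRESHOLD = {
    Rigor.MOST: 0.0,       # assumption: always examine the report
    Rigor.HIGH: 10.0,
    Rigor.MODERATE: 20.0,
    Rigor.LESS: 30.0,
}


def must_examine_report(score: float, rigor: Rigor) -> bool:
    """Return True if the Similarity Report should be inspected.

    `score` is the similarity score as a percentage (0 to 100).
    A False result means the evaluator knowingly accepts the risk
    of relying on the score alone; it says nothing about whether
    plagiarism is present.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("similarity score must be between 0 and 100")
    return score >= REVIEW_THRESHOLD[rigor]
```

Under these assumed cut-offs, a manuscript scoring 15% would require a report review in the high-rigor band but not in the moderate-rigor band. The lower an evaluator sets a threshold, the less risk they accept by skipping the report.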
Created: 2021-10-18