Language Resources for Intrinsic Plagiarism Detection in Urdu Language
Source: Mendeley Data. Indexed 2026-04-09.
Download link:
https://data.mendeley.com/datasets/8fknny5s5p
Description:
This dataset supports intrinsic plagiarism detection in Urdu. To produce a high-quality dataset for training a classification algorithm, we gathered Urdu essays and reports from popular, highly trending websites such as jang.com, urduessaypoint.blogspot.com, and www.dawnnews.tv. All documents gathered from these websites were then compiled in .txt format. More than 2,500 plagiarized and non-plagiarized documents were created systematically.



