
Review of Text Clustering Methods and Suggested Solutions for Theme Based Clustering of the Quran

Source: Mendeley Data. Updated: 2020-01-22. Indexed: 2026-04-09.
Download link:
https://data.mendeley.com/datasets/kb92kdjtcz
Description:
The datasets use modern, unedited, and unmarked Arabic texts. The first is a sample of nearly 1,680 documents obtained from various online Arabic resources; this testing dataset covers four fields: art, economics, politics, and sports articles. The second collection consists of 383,872 Arabic documents, primarily newswire dispatches released by Agence France-Presse (AFP) between 1994 and 2000. Standard TREC classes and ground truth were established for this collection, with 10 classes defined as part of TREC 2001. The last dataset is the Qur'an data, which was converted from softcopy into a database that links each chapter with its verses.
Created:
2020-01-22