
Urdu Nastalique

Source: IEEE. Updated: 2019-02-27. Indexed: 2026-04-17.
Download link:
https://ieee-dataport.org/documents/urdu-nastalique
Description:
The performance of most classification models depends on the data used for training, so the data must be reliable, robust, and meticulously labelled. A systematic approach was designed to build such a dataset. The source text was collected from a well-known source, the Center for Language Engineering, available at http://www.cle.org.pk. The corpus used for prediction, available on that website, contains Urdu Naskh data comprising 4,325 lines and 122,284 words, spread across three text files. This corpus was converted into the Jameel Noori Urdu Nastalique font style, keeping the same 4,325 lines and 122,284 words. The context-sensitive nature of Urdu Nastalique poses several challenges. The corpus text was converted into images because ligature segmentation and line segmentation of images are themselves challenging tasks in OCR systems.
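The line and word totals quoted in the description can be checked against the downloaded text files with a short script. The following is a minimal sketch, not the dataset authors' tooling: the function name `corpus_stats`, the UTF-8 encoding, whitespace-based word splitting, and the choice to skip blank lines are all assumptions, so the totals it reports may differ from the published figures if the authors counted differently.

```python
from pathlib import Path


def corpus_stats(paths):
    """Return (line_count, word_count) summed over UTF-8 text files.

    Assumptions (not confirmed by the dataset description):
    blank lines are ignored and words are separated by whitespace.
    """
    lines = 0
    words = 0
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        for line in text.splitlines():
            if line.strip():  # skip blank or whitespace-only lines
                lines += 1
                words += len(line.split())
    return lines, words
```

For example, `corpus_stats(sorted(Path("corpus").glob("*.txt")))` would sum the counts over every `.txt` file in a hypothetical local `corpus` directory.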
Created:
2019-02-27