Urdu Handwritten Text Dataset
Source: Mendeley Data. Updated: 2024-01-31. Indexed: 2024-06-26.
Download link:
https://data.mendeley.com/datasets/bg2sctsysf
Description:
The dataset contains images of handwritten text in Urdu, one of the most widely spoken languages in South Asia. Native-speaking authors from different social backgrounds were invited to copy a prepared text in their own handwriting. The prepared text was composed to include almost all the characters, ligatures, diacritics, and dots used in Urdu script. Persons with disabilities were also invited to write the text, making the data collection more comprehensive. Demographic data about the authors was also recorded to support research such as author identification and text matching.
Created:
2024-01-31



