Nigerian Language Dataset (Wa-Zo-Bia)
Source: doi.org. Updated: 2022-09-05. Indexed: 2025-03-24.
Download link:
http://doi.org/10.17632/jccjsk6pd3.2
Description:
This dataset contains the complete alphabets, from the first letter to the last, of the most common Nigerian languages and can be used for character recognition. It was recorded physically; some of the images have been binarized, while others have not. The handwriting of 50 students was captured in both uppercase and lowercase for each of the languages.
The dataset file: This file contains the raw images of the dataset, which is why it is the largest file.
The binary file: This file contains the raw images converted to binary (black-and-white) format with a threshold of 210, which is why it is the smallest file.
The sorted file: This file contains the images sorted by letter, i.e., one folder holds all the 'A' samples, another all the 'B' samples, and so on through 'Z'. That is how it differs from the binary file. Download whichever file you choose to use, then unzip it.
The resized file: This file contains all the images resized to a specific dimension.
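As a rough illustration of what binarizing with a threshold of 210 means, here is a minimal sketch in plain Python. It is not the contributors' actual script. The function name `binarize` and the choice that pixels strictly above the threshold become white are assumptions; the description gives only the threshold value.

```python
from typing import List

THRESHOLD = 210  # threshold value stated in the dataset description


def binarize(gray: List[List[int]], threshold: int = THRESHOLD) -> List[List[int]]:
    """Map each 8-bit grayscale pixel to 0 (black) or 255 (white).

    Assumption: values strictly above the threshold become white. The
    description does not say which side of the threshold maps to which.
    """
    return [[255 if px > threshold else 0 for px in row] for row in gray]


# A tiny 2x3 grayscale "image" used only for demonstration.
sample = [
    [255, 240, 211],
    [210, 120, 0],
]
binary = binarize(sample)
print(binary)
```

With a high threshold such as 210, only near-white paper stays white and lighter pen strokes are still kept as ink. Real images would first need to be loaded and converted to grayscale with an imaging library of your choice.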
Because different contributors produced the dataset, the files and images vary.
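Since the sorted file groups images into one folder per letter and the files vary between contributors, a small indexing helper can make the unzipped archive easier to work with. The sketch below is an assumption-laden example: the helper name `index_by_letter`, the set of image extensions, and the idea that each letter folder sits directly under the unzipped root are all hypothetical and should be checked against the actual archive.

```python
from pathlib import Path
from typing import Dict, List

# Assumed image extensions; adjust after inspecting the unzipped archive.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def index_by_letter(root: Path) -> Dict[str, List[Path]]:
    """Return {folder name: sorted image paths} for each sub-folder of root.

    Non-image files are skipped and extensions are matched case-insensitively,
    because the files differ from contributor to contributor.
    """
    index: Dict[str, List[Path]] = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(
            f for f in folder.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        index[folder.name] = images
    return index
```

For example, `index_by_letter(Path("sorted"))` would return a dictionary keyed by folder name, where `"sorted"` stands in for wherever you unzipped the sorted file.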
Have fun making use of it ;-)
Provider:
doi.org



