German Character Recognition Dataset
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/records/8364967
Description:
The dataset contains 282,472 grayscale images, each 40 x 40 pixels, covering 82 distinct German characters, digits, and mathematical symbols.
Unlike the MNIST dataset, where image alignment varies, every image in this dataset is aligned the same way: each character is centered within the 40 x 40 bounding box and touches either both the left and right borders or both the top and bottom borders. This consistent alignment simplifies the training task and contributes to strong performance metrics.
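The alignment rule described above can be checked on a single image with a small sketch like the following. It is not part of the dataset's attached code. It assumes that background pixels have the value 0 and allows a one-pixel tolerance when testing for centering; both are assumptions, and the helper names `bounding_box` and `is_aligned` are hypothetical.

```python
SIDE = 40  # image side length stated in the description


def bounding_box(image, background=0):
    """Return (top, bottom, left, right) of the non-background pixels, or None.

    `background=0` is an assumption; the description does not state
    which grayscale value represents empty space.
    """
    rows = [r for r, line in enumerate(image) if any(v != background for v in line)]
    cols = [c for c in range(len(image[0]))
            if any(line[c] != background for line in image)]
    if not rows:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def is_aligned(image, background=0):
    """Check that the glyph spans one full axis and is centered on the other."""
    box = bounding_box(image, background)
    if box is None:
        return False
    top, bottom, left, right = box
    last_row, last_col = len(image) - 1, len(image[0]) - 1
    spans_width = left == 0 and right == last_col
    spans_height = top == 0 and bottom == last_row
    # Margins on opposite sides may differ by at most one pixel (assumed tolerance).
    centered_vertically = abs(top - (last_row - bottom)) <= 1
    centered_horizontally = abs(left - (last_col - right)) <= 1
    return (spans_width and centered_vertically) or (
        spans_height and centered_horizontally)


# Synthetic examples: a full-height bar in the middle, and one pushed to the left.
centered_bar = [[255 if 18 <= c <= 21 else 0 for c in range(SIDE)] for _ in range(SIDE)]
shifted_bar = [[255 if c <= 3 else 0 for c in range(SIDE)] for _ in range(SIDE)]
```

Here `is_aligned(centered_bar)` holds because the bar touches the top and bottom borders and has equal left and right margins, while `shifted_bar` fails the centering test.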
The training and test data are stored in two separate CSV files. In each file, the first column holds the Unicode character, and the following 1600 values (40 x 40) are the grayscale values of the flattened image. If anything is unclear, please refer to our attached code, which provides complete logic for training a CNN in PyTorch and lets you select the specific classes to train on. When training only on the digits 0 to 9, we achieved an accuracy and a Matthews Correlation Coefficient (MCC) of roughly 99% on the test data.
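As a rough illustration of the CSV layout described above, the sketch below parses rows into labeled 40 x 40 images and optionally keeps only selected classes. It is not the attached PyTorch code. It assumes that the first column holds the character itself (not a numeric code point), that the pixels are flattened in row-major order, and that any header row can be skipped; the function name `load_rows` is hypothetical.

```python
import csv
import io

SIDE = 40                # image side length stated in the description
N_PIXELS = SIDE * SIDE   # 1600 flattened grayscale values per row


def load_rows(handle, keep=None):
    """Parse rows of `label, p0, ..., p1599` into (label, image) pairs.

    `keep` is an optional set of labels to retain, e.g. the ten digits.
    Rows that do not have exactly 1 + 1600 cells, or whose pixel cells are
    not numeric, are skipped (this also drops a header row, if one exists).
    """
    samples = []
    for row in csv.reader(handle):
        if len(row) != 1 + N_PIXELS:
            continue
        label, values = row[0], row[1:]
        try:
            pixels = [int(float(v)) for v in values]
        except ValueError:
            continue  # non-numeric cells, e.g. a header row
        if keep is not None and label not in keep:
            continue
        # Row-major reshape into 40 rows of 40 pixels (assumed ordering).
        image = [pixels[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
        samples.append((label, image))
    return samples


# Synthetic two-row example standing in for a real file.
demo = io.StringIO(
    "7," + ",".join(["0"] * N_PIXELS) + "\n"
    "ä," + ",".join(["255"] * N_PIXELS) + "\n"
)
digits = load_rows(demo, keep=set("0123456789"))
```

With a real file, the handle would typically come from something like `open(path, newline="", encoding="utf-8")`, where `path` points at one of the two CSV files. The file names are not given in the description, so they are left out here.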
Created:
2023-09-21



