
German Character Recognition Dataset

Source: NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/records/8364967
Description:
The dataset contains 282,472 grayscale images of 40 x 40 pixels each, covering 82 distinct classes: German characters, digits and mathematical symbols. Unlike the MNIST dataset, where image alignment varies, every image here is aligned the same way. Each glyph is centered in its 40 x 40 bounding box and scaled so that it touches either the left and right borders or the top and bottom borders. This consistent alignment makes the training task considerably easier and contributes to strong performance metrics. The training and test data are stored in two separate CSV files. In each file, the first column holds the Unicode character (the label), and the following 1600 values are the grayscale values of the flattened image. If anything is unclear, please refer to the attached code, which contains the complete logic for training a CNN in PyTorch and lets you select the specific classes to train on. When training only on the digits 0 to 9, we achieved an accuracy and a Matthews Correlation Coefficient (MCC) of roughly 99% on the test data.
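The row layout described above (one label followed by 1600 pixel values) can be read roughly as in the sketch below. This is not the authors' attached code. It assumes comma-separated files without a header row and row-major flattening, neither of which the description confirms; `load_rows` and its `keep` parameter are hypothetical names, and the demo uses synthetic in-memory rows rather than the real CSV files.

```python
import csv
import io

import numpy as np

SIDE = 40                # image width and height, from the description
N_PIXELS = SIDE * SIDE   # 1600 flattened grayscale values per row


def load_rows(handle, keep=None):
    """Parse rows of the form: character, then 1600 grayscale values.

    `keep` is an optional set of characters to retain (class selection).
    Assumption: no header row, comma-separated, row-major pixel order.
    """
    labels, images = [], []
    for row in csv.reader(handle):
        if len(row) != N_PIXELS + 1:
            continue  # skip blank or malformed rows
        char = row[0]
        if keep is not None and char not in keep:
            continue
        pixels = np.array([float(v) for v in row[1:]], dtype=np.float32)
        images.append(pixels.reshape(SIDE, SIDE))
        labels.append(char)
    if not images:
        return labels, np.empty((0, SIDE, SIDE), dtype=np.float32)
    return labels, np.stack(images)


# Synthetic two-row demo: an all-black "7" and an all-white "ä".
demo = io.StringIO(
    "7," + ",".join(["0"] * N_PIXELS) + "\n"
    + "ä," + ",".join(["255"] * N_PIXELS) + "\n"
)
labels, images = load_rows(demo, keep={"7"})
```

With the real files, the same function could be called on an opened CSV (for example with `encoding="utf-8"` and `newline=""`), passing `keep=set("0123456789")` to restrict training to the digits.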
Created:
2023-09-21