RA-Data-Science/DiEm_HTR
Source: Hugging Face | Updated: 2025-11-11 | Indexed: 2025-11-15
Download link:
https://hf-mirror.com/datasets/RA-Data-Science/DiEm_HTR
Description:
The DiEm HTR dataset is a ground truth dataset for historical Danish handwriting from the 17th and 18th centuries, created as part of the Digitalisering af Enesteministerialbøger project at the Danish National Archives. It consists of 975 transcribed images containing a total of 67,410 text lines and 383,339 words. Each record has the following features: image, document ID, sequence, ALTO XML, and PAGE XML. The dataset is intended primarily for training Handwritten Text Recognition (HTR) models for Danish handwriting of that period.
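Since each record carries its transcription as PAGE XML, a minimal sketch of pulling line-level text out of such a string may be useful when preparing HTR training pairs. The sketch assumes the usual PAGE layout (TextLine elements holding a TextEquiv/Unicode child) and matches tags by local name so it does not depend on a particular schema version. The sample document, its namespace date, and the helper names `local_name` and `page_xml_lines` are invented for illustration and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Invented sample for illustration only; real records in the dataset will differ.
SAMPLE_PAGE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <TextLine id="l1"><TextEquiv><Unicode>Anno 1701 den 3 Januarii</Unicode></TextEquiv></TextLine>
      <TextLine id="l2"><TextEquiv><Unicode>døbt Hans Jensens Søn</Unicode></TextEquiv></TextLine>
      <TextEquiv><Unicode>region-level text, ignored</Unicode></TextEquiv>
    </TextRegion>
  </Page>
</PcGts>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def page_xml_lines(xml_text: str) -> list[str]:
    """Return the transcription of every TextLine, in document order.

    Only the TextEquiv that is a direct child of a TextLine is read, so
    region-level and word-level transcriptions are not counted twice.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        for child in elem:
            if local_name(child.tag) != "TextEquiv":
                continue
            for grandchild in child:
                if local_name(grandchild.tag) == "Unicode":
                    lines.append((grandchild.text or "").strip())
            break  # use the first line-level TextEquiv only
    return lines


lines = page_xml_lines(SAMPLE_PAGE_XML)
# Whitespace tokenisation is an assumption; the dataset's own word count
# may have been computed differently.
word_count = sum(len(line.split()) for line in lines)
print(len(lines), word_count)
```

Applied to every record, the same function would give per-page line and word totals that could be compared against the figures quoted in the description, with the caveat that the whitespace-based word count here is only an approximation.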
Provider:
RA-Data-Science



