dh-unibe/image-text_bullinger-autoren
Source: Hugging Face. Updated: 2026-04-26. Indexed: 2026-05-10.
Download link:
https://hf-mirror.com/datasets/dh-unibe/image-text_bullinger-autoren
Description:
This dataset contains 8,022 samples in a single split (train). Geographical scope: Europe. Period: 1530-1600. Languages: Latin and Early Modern German. Document type: letters. Provenance: various archives. It was created from Transkribus PageXML data using the pagexml-hf converter and is suitable for image-to-text tasks such as handwritten text recognition (HTR) and TrOCR transcription. The dataset spans multiple projects covering several historical figures (e.g., Heinrich Bullinger). Each sample includes features such as image, xml_content, filename, and project_name.
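Because each sample carries its PageXML in the xml_content feature, line-level transcriptions can be recovered with the Python standard library. The following is a minimal sketch, not the converter's own logic: the helper name `extract_lines` and the inline `SAMPLE_XML` are hypothetical, and it assumes the usual PAGE layout in which each TextLine holds its text in a direct TextEquiv/Unicode child. Matching on local tag names avoids hard-coding a particular PAGE schema namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one sample's xml_content value; real records
# would come from the dataset, e.g. via the Hugging Face `datasets` library.
SAMPLE_XML = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg">
    <TextRegion id="r1">
      <TextLine id="l1">
        <TextEquiv><Unicode>Salutem plurimam</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l2">
        <TextEquiv><Unicode>Gnad und frid</Unicode></TextEquiv>
      </TextLine>
      <TextEquiv><Unicode>Salutem plurimam
Gnad und frid</Unicode></TextEquiv>
    </TextRegion>
  </Page>
</PcGts>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_lines(xml_content):
    """Return the text of every TextLine in a PageXML string, in document order."""
    root = ET.fromstring(xml_content)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Only look at direct children so that word-level or region-level
        # TextEquiv elements are not counted as line transcriptions.
        for child in elem:
            if local_name(child.tag) != "TextEquiv":
                continue
            for unicode_elem in child:
                if local_name(unicode_elem.tag) == "Unicode" and unicode_elem.text:
                    lines.append(unicode_elem.text.strip())
    return lines


if __name__ == "__main__":
    for line in extract_lines(SAMPLE_XML):
        print(line)
```

With a loaded sample, the same helper would be called on the sample's xml_content string. Pairing the returned lines with crops of the image feature is a separate step that would need each TextLine's Coords element and is not shown here.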
Provider:
dh-unibe



