HTRCatalogs: Dataset for historical catalogs HTR and Segmentation

Indexed by NIAID Data Ecosystem on 2026-03-12
Download link:
https://zenodo.org/record/5458349
Description:
This release contains 465 XML files and their corresponding images from a large corpus of 19th-, 20th-, and 21st-century exhibition catalogs, manuscript fair catalogs, and directories. The new catalogs added here were created using the HTR and segmentation models accessible in the repository. The release includes a CSV file describing the XML files and various tools for creating a training dataset: several bash scripts, a Python program that divides the XML files into testing, training, and evaluation datasets, and several fixed tests. An XSL transformation sheet is also available to delete the Entry and EntryEnd zones from the XML files in order to obtain a SegmOnto-like dataset. The XML files have been corrected since the 4.0 release thanks to the addition of a GitHub action (SegmOntoKraken).
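The division of the XML files into training, evaluation, and testing sets mentioned above can be illustrated with a minimal sketch. This is not the program shipped with the release: the function name `split_files`, the 80/10/10 ratios, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def split_files(paths, train=0.8, eval_=0.1, seed=42):
    """Split file paths into train/eval/test lists (hypothetical helper).

    The ratios and seed are illustrative assumptions, not values taken
    from the HTRCatalogs release.
    """
    # Sort first so the result does not depend on the input order.
    files = sorted(paths)
    # A seeded generator makes the shuffle reproducible.
    random.Random(seed).shuffle(files)
    n_train = int(len(files) * train)
    n_eval = int(len(files) * eval_)
    return {
        "train": files[:n_train],
        "eval": files[n_train:n_train + n_eval],
        # Whatever remains after the first two slices becomes the test set.
        "test": files[n_train + n_eval:],
    }
```

Calling `split_files` on a list of XML paths returns three disjoint lists. Because the input is sorted and the shuffle is seeded, repeated runs on the same files give the same split, which keeps a test set fixed across experiments.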
Created:
2021-09-07