HORAE
Source: arXiv. Updated 2020-12-01. Indexed 2024-06-21.
Download link:
https://github.com/oriflamms/HORAE/
Description:
The HORAE dataset was created by the French National Centre for Scientific Research (CNRS) and contains 557 annotated pages from medieval books of hours manuscripts. Books of hours combine rich illustration with a wide variety of religious texts, which makes them an important source for studying the evolution of religious mentalities in Europe. The pages were carefully selected and manually annotated to cover a range of page types and layouts, with the goal of supporting automatic analysis of historical documents with machine learning. The dataset is suited to image segmentation and neural network training, with a particular focus on layout analysis and text recognition for medieval manuscripts.
Provider:
French National Centre for Scientific Research (CNRS)
Created:
2020-12-01



