Handwritten Text Recognition Test Set: Minutes of the Swiss Federal Council (1848-1903)

Indexed by NIAID Data Ecosystem on 2026-03-12
Download link:
https://zenodo.org/record/4746341
Description:
This data set is a test set for evaluating Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) engines. It consists of extracts from the minutes of the Swiss Federal Council. The individual lines were chosen at random from about 150,000 pages of handwritten minutes. For each line, the Swiss Federal Archives (Schweizerisches Bundesarchiv) provide an image file [images.tar.gz]. Please cite the images as follows: Excerpts of BAR E1004.1#1000/9#1-215. The images are in the public domain. A PageXML file [page.zip] accompanies every image file and gives the transcription and coordinates of the line. For PageXML, see Pletschacher, S., & Antonacopoulos, A. (2010). The PAGE (Page Analysis and Ground-Truth Elements) Format Framework. pp. 257–260. https://doi.org/10.1109/ICPR.2010.72.
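The description above says each PageXML file carries the transcription and coordinates of one line. The following is a minimal sketch of how such a file might be read with Python's standard library. It assumes the common PAGE layout, in which a TextLine element holds a Coords element with a points attribute and a TextEquiv/Unicode element with the text. The sample XML, the namespace version, the file name example.jpg, and the function names are illustrative assumptions and are not taken from the data set itself; older PAGE versions encode coordinates differently and would need adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the PAGE format; real files in page.zip may use a
# different schema version, element nesting, or coordinate encoding.
SAMPLE_PAGE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="1000" imageHeight="100">
    <TextRegion id="r1">
      <Coords points="0,0 1000,0 1000,100 0,100"/>
      <TextLine id="l1">
        <Coords points="10,10 990,10 990,90 10,90"/>
        <TextEquiv><Unicode>Beispielzeile</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code works across PAGE versions."""
    return tag.rsplit("}", 1)[-1]


def child(element, name):
    """Return the first direct child with the given local name, or None."""
    for node in element:
        if local_name(node.tag) == name:
            return node
    return None


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    pairs = (pair.split(",") for pair in points.split())
    return [(int(x), int(y)) for x, y in pairs]


def read_lines(xml_text):
    """Collect id, transcription, and polygon for every TextLine element."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        coords = child(element, "Coords")
        text_equiv = child(element, "TextEquiv")
        unicode_node = child(text_equiv, "Unicode") if text_equiv is not None else None
        lines.append({
            "id": element.get("id"),
            "text": (unicode_node.text or "") if unicode_node is not None else "",
            "points": parse_points(coords.get("points", "")) if coords is not None else [],
        })
    return lines


if __name__ == "__main__":
    for line in read_lines(SAMPLE_PAGE_XML):
        print(line["id"], line["text"], line["points"])
```

To use this on the real data, the contents of a file extracted from page.zip could be passed to read_lines in place of the sample string; the returned transcriptions can then serve as ground truth when scoring an OCR or HTR engine.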
Created:
2021-05-11