HTR Model Spanish Gothic Incunabula (HSMS)
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14171447
Description:
The Spanish Gothic Incunabula (HSMS) is intended to be uploaded to the Transkribus platform (READ Coop) to train a PyLaia model for the automated recognition of Spanish incunabula in Gothic script printed between 1472 and 1500. It can also be used for post-incunabula (up to 1520). The transcription model follows the rules set by the Hispanic Seminary of Medieval Studies in 1977 (newest version). The rules applied are:
Abbreviated words are expanded and the expanded text is enclosed between < >: q
Superscripted letters are followed by a grave accent: qi`en
All ç and ñ are transcribed as ç and ñ
No attempt has been made to normalize spacing
All punctuation signs are kept
All abbreviated nasals before b or p are transcribed as <n>. It is up to the editors whether they should be changed to m.
Abbreviated v' (tilde over v, or small slash v) that can be expanded as v or v is expanded as v. It is up to the editor if they should be changed to v.
Tironian et is transcribed as & (ampersand)
Pilcrows are transcribed as ¶
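The marker conventions above (expansions enclosed in < > and a grave accent after superscripted letters) can be stripped mechanically when a plain reading text is wanted. The following is a minimal sketch, not part of the dataset or of Transkribus: the function names and the sample input are hypothetical, and it assumes the angle brackets and grave accents occur only as transcription markers.

```python
import re

# Hypothetical helpers; assume "<...>" and "`" appear only as HSMS markers.
EXPANSION = re.compile(r"<([^<>]*)>")


def count_expansions(line: str) -> int:
    """Count the expanded abbreviations marked with < > in a line."""
    return len(EXPANSION.findall(line))


def hsms_to_reading_text(line: str) -> str:
    """Turn an HSMS-style transcription line into plain reading text."""
    # Keep the expanded letters but drop the enclosing angle brackets.
    line = EXPANSION.sub(r"\1", line)
    # Drop the grave accent that flags the preceding letter as superscripted.
    return line.replace("`", "")


if __name__ == "__main__":
    sample = "¶ q<ue> & qi`en"  # illustrative input, not taken from the dataset
    print(count_expansions(sample))
    print(hsms_to_reading_text(sample))
```

The ampersand and pilcrow are left untouched here, since the rules transcribe them as ordinary characters; an editor who prefers "et" or paragraph breaks could add further replacements.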
The model is built on 200 openings (verso-recto) drawn from 20 books printed by five different workshops from Seville, Zaragoza, Burgos, Toledo and Pamplona. The Train Set consists of 180,152 words distributed over 24,061 lines. The CER on the Train Set is 0.20 % and on the Validation Set 0.77 %.
Created:
2024-11-22
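For readers unfamiliar with the metric, the character error rate (CER) is conventionally the character-level edit distance between the recognized text and the reference transcription, divided by the reference length. The sketch below illustrates that standard definition only; how Transkribus computes the reported figures is not stated in this record, and the function names and sample strings are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of a hypothesis against a reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only, not drawn from the Train or Validation Set.
    print(f"{cer('q<ue> dize', 'q<ue> dixe'):.2%}")
```

Under this definition, a CER of 0.77 % corresponds to fewer than one character error per 100 reference characters.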



