
Ground Truth Set for Handwritten Text Recognition (HTR/OCR): Dresdner Hofdiarium 1665 (Mscr.Dresd.K.80) - 17th century Kurrent manuscript

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14356189
Description:
Ten pages (fol. 85r-89v) of ground truth from the "Hofdiarium des Kurfürsten Johann Georgs II. 1665" (SLUB Mscr.Dresd.K.80; https://www.wikidata.org/wiki/Q131379540). The handwriting is a typical late 17th-century Saxon Kurrent ("Kanzleikurrent"), with occasional words written in bastarda or fraktur-like script. Images and text were aligned in eScriptorium using the default segmentation model (blla.mlmodel), followed by manual correction.

ALTO and PAGE XML are provided, along with a TXT file of the full transcription and the JPGs of the original source, in appropriately named folders. Scans have been provided by the SLUB Dresden under the Public Domain Mark: https://katalog.slub-dresden.de/id/0-1858746256

The transcription guidelines follow the DTABF-M schema (https://www.deutschestextarchiv.de/doku/basisformat/manuskript.html), adapted as follows:
- I and J majuscules are not distinguished
- u and v are reproduced true to the original (e.g. vnd)
- Long-s (ſ) and round-s (s) are distinguished
- The sz ligature is rendered as ß in Kurrent scripts and as sz (e.g. "Libusza") in Antiqua scripts
- The ij ligature is rendered as y
- Other ligatures, if they occur at all, are dissolved
- All r graphemes are rendered as the modern r
- An m with a nasal stroke is rendered as a simple m
- Where possible, abbreviation signs (Abbrechungszeichen) used by contemporaries to mark abbreviations are included as single letters and not marked separately. The subsequent punctuation mark ("." or ":") that further identifies the abbreviation is also included (cf. Cappelli, 1928, Lexicon abbreviaturarum I, p. X)
- Diacritics on u are not marked
- Where capitalization is uncertain, it is approximated from the letter size

The dataset can be used under the CC BY-NC-SA 4.0 License.

This transcription is part of a larger project on the Dresden court diaries. Check https://slub-dresden.academia.edu/StefanBeckert for further updates.
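Since the ground truth ships as ALTO XML, the line-level text can be pulled out with a few lines of Python. The following is only a minimal sketch, not part of the dataset: it assumes the standard ALTO layout in which each `TextLine` holds one or more `String` elements with a `CONTENT` attribute, and it ignores the XML namespace so that it does not depend on a particular ALTO version. The function names and the `SAMPLE` fragment are made up for illustration and are not taken from the manuscript.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace, e.g. '{http://...}TextLine' -> 'TextLine'."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text: str) -> list[str]:
    """Return the text of each TextLine, joining its String elements with spaces."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Assumption: the transcription sits in the CONTENT attribute of String children.
        words = [
            child.attrib.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return lines


# Invented ALTO fragment for demonstration only (not a line from the dataset).
SAMPLE = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="vnd"/><SP/><String CONTENT="der"/></TextLine>
    <TextLine><String CONTENT="Hof"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>
"""

if __name__ == "__main__":
    for line in alto_lines(SAMPLE):
        print(line)
```

For a real page, the file contents can be read as bytes and passed to `alto_lines` in place of `SAMPLE`; the actual file and folder names in the archive are not assumed here.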
Created:
2024-12-10