MPS Data set with images of medieval charters for handwriting-style based dating of manuscripts
Indexed from NIAID Data Ecosystem on 2026-03-11
Download link:
https://zenodo.org/record/1194356
Description:
The MPS benchmark data set for handwritten manuscript dating
____________________________________________________________
This data set is collected for the Dutch NWO project:
Medieval Paleographical Scale (MPS)
by Petros Samara
Project website: http://application02.target.rug.nl/monk/Projects/MPS/
Copyright (c) Huygensinstituut, Den Haag, 2016
University of Groningen, 2016.
All rights reserved.
Organisation of the data: Each .tar.gz file contains a number of NetPBM
images. The format was chosen for its simplicity and because it is
uncompressed, so no lossy compression enters the processing chain. The file
names are of the format 'MPS<year>_<number>.ppm', for example, 'MPS1300_0056.ppm'.
Note: the files are not in a separate directory; they will be extracted in place.
However, because the names are unique, they can safely be extracted into a
single current (destination) directory.
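The naming scheme above can be used to unpack an archive and recover the year label from each file name. The following is a minimal sketch, not part of the data set: it assumes that the four digits after 'MPS' are the year and the digits after the underscore are a sequence number (as the example 'MPS1300_0056.ppm' suggests). The function names extract_archive and parse_mps_name are hypothetical.

```python
import re
import tarfile

# Assumed pattern: 'MPS' + 4-digit year + '_' + sequence number + .ppm or .pgm
_MPS_NAME = re.compile(r"^MPS(\d{4})_(\d+)\.(ppm|pgm)$")


def extract_archive(archive_path, dest_dir):
    """Extract one .tar.gz archive; the images land directly in dest_dir."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)


def parse_mps_name(filename):
    """Return (year, number) parsed from an MPS file name, or None if it does not match."""
    match = _MPS_NAME.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_mps_name("MPS1300_0056.ppm"))  # (1300, 56)
```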
The actual type of the image can be grayscale (.pgm) or color (.ppm),
reported as '8-bit DirectClass' by ImageMagick's 'identify' tool.
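Since an image may be grayscale or color, it can be useful to check the type from the file header rather than from the extension. The sketch below relies only on the standard NetPBM magic numbers (P2/P5 for graymaps, P3/P6 for pixmaps). The function name netpbm_kind is hypothetical.

```python
# Standard NetPBM magic numbers for graymaps (PGM) and pixmaps (PPM).
_KINDS = {
    b"P2": "grayscale",  # ASCII PGM
    b"P5": "grayscale",  # binary PGM
    b"P3": "color",      # ASCII PPM
    b"P6": "color",      # binary PPM
}


def netpbm_kind(path):
    """Return 'grayscale' or 'color' based on the first two bytes, else None."""
    with open(path, "rb") as handle:
        return _KINDS.get(handle.read(2))
```

For example, netpbm_kind("MPS1300_0056.ppm") would return "color" if that file starts with the bytes "P6".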
The images were cropped out of larger photographs to remove irrelevant
elements such as a Kodak color calibrator and non-text content such as the
supporting surface (table), seals (emblems), ribbons, etc.
No effort has been made to obtain a balanced set of samples over years:
the given frequencies of occurrence in archives are used.
There is evidently less data for the years before 1375 A.D., while some periods
provide us with ample data for historical reasons (e.g., 1450 A.D.). It
would have been a pity if the scarce years had determined and limited the
size of this data set. Selection criteria for data reduction, whether random
or systematic, would have been arbitrary. In any case, these images were
used in our publications, such that the performance results of
future attempts on manuscript dating can be compared with earlier results.
The performances that have been reached using our algorithms are on
the order of an MAE (mean absolute error) of 10 years.
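To compare new results against this figure, the MAE can be computed as the mean of the absolute differences between true and predicted years. The sketch below reads MAE as mean absolute error. The year values are invented for illustration and are not results from the data set, and the function name is hypothetical.

```python
def mean_absolute_error(true_years, predicted_years):
    """Mean of |true - predicted| over all samples, in years."""
    if not true_years or len(true_years) != len(predicted_years):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_years, predicted_years))
    return total / len(true_years)


# Illustrative values only: errors of 10, 5 and 15 years average to 10.
print(mean_absolute_error([1300, 1350, 1400], [1310, 1345, 1385]))  # 10.0
```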
If you have any questions, please contact us:
Sheng He (heshengxgd@gmail.com)
Petros Samara (petros.samara@huygens.knaw.nl)
Jan Burgers (jan.burgers@huygens.knaw.nl)
Lambert Schomaker (L.Schomaker@ai.rug.nl)
Please cite our papers if you use this data set:
[1] Sheng He, Petros Samara, Jan Burgers, Lambert Schomaker.
Image-based historical manuscript dating using contour and stroke fragments.
Pattern Recognition (PR), Vol. 59, pp. 159-171, 2016.
[2] Sheng He, Petros Samara, Jan Burgers, Lambert Schomaker.
Towards style-based dating of historical documents.
International Conference on Frontiers in Handwriting Recognition (ICFHR), Crete, Greece, 2014.
[3] Sheng He, Petros Samara, Jan Burgers, Lambert Schomaker.
Multiple-Label Guided Clustering Algorithm for Historical Document Dating and Localization.
IEEE Trans. on Image Processing, Vol. 25(11), Nov. 2016.
http://ieeexplore.ieee.org/document/7551181/
Data were collected thanks to Dutch NWO grant project 380-50-006.
Created:
2020-01-24



