IEEHR 2017 - The Esposalles

Indexed by Payititi on 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-335.html
Resource description:
Extracting relevant information from historical handwritten document collections is one of the key steps in making these manuscripts available for access and search. In this context, the objective is to move beyond pure transcription towards document understanding. Concretely, the aim is to detect the named entities and assign each of them a semantic category, such as family names, places, occupations, etc. A typical application scenario for named entity recognition is demographic documents, since they contain people's names, birthplaces, occupations, etc. In this scenario, extracting the key contents and storing them in databases allows access to those contents and makes it possible to envision innovative services based on genealogical, social or demographic searches. Lately, the interest of the document image analysis community in document understanding, named entity recognition and semantic categorization has been growing, and some techniques based on HMMs, BLSTMs and CNNs have been proposed. With this competition, we aim to foster research in this field and offer a benchmark for the research community. This database consists of historical handwritten marriage records from the Archives of the Cathedral of Barcelona. The pages we used correspond to volume 69, written in old Catalan by a single writer in the 17th century. Each marriage record contains information about the husband's occupation, place of origin, the husband's and wife's former marital status, the parents' occupations, place of residence, geographical origin, etc.
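The task described above, assigning a semantic category to each transcribed word, can be pictured with a small data structure. The following is only an illustrative sketch: the class name `LabelledWord`, the helper `group_entities`, the category labels and the sample record are all hypothetical and are not taken from the dataset or its official annotation format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledWord:
    """One transcribed word with a semantic category (hypothetical labels)."""
    text: str      # transcription of the word
    category: str  # e.g. "name", "occupation"; "other" marks non-entities


def group_entities(words):
    """Collect the words of each semantic category, in reading order."""
    grouped = defaultdict(list)
    for word in words:
        if word.category != "other":  # skip words that are not named entities
            grouped[word.category].append(word.text)
    return dict(grouped)


# Invented example record, not taken from the dataset.
record = [
    LabelledWord("Joan", "name"),
    LabelledWord("Ferrer", "surname"),
    LabelledWord("pages", "occupation"),
    LabelledWord("de", "other"),
    LabelledWord("Barcelona", "location"),
]

entities = group_entities(record)
print(entities)
```

Grouping by category in this way mirrors the idea of storing the key contents of a record in a database so that they can be searched by name, place or occupation.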
Provider:
Payititi