Euronews XML corpus

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/5524181
Description:
The Euronews XML corpus comprises transcriptions, encoded in XML, of handwritten newsletters dating from 1550 to 1730 and preserved today in the Florence State Archives. Manuscript newsletters, called avvisi in Italian, are a Renaissance invention: usually anonymous sheets, reproduced in multiple copies, which eventually became the basis of the first printed journalism. The Euronews project team developed a methodology for encoding this type of early modern news source and for building a corpus usable for data analytics. The transcription and XML encoding guidelines are explained in detail at https://github.com/lallori/euronews-xml-corpus/wiki/transcription-xml-encoding-guidelines. The main language of the transcribed and encoded documents is Italian (XVI-XVII century). The Euronews Project is funded by the Irish Research Council through IRCLA/2019/41 and is hosted by University College Cork in collaboration with the Medici Archive Project.
Created:
2021-09-23