People, Places, and Languages in the correspondence preserved in the archive of the International Institute of Intellectual Cooperation
DataCite Commons | Updated: 2025-09-29 | Indexed: 2024-07-13
Download link:
https://dataverse.csuc.cat/citation?persistentId=doi:10.34810/data985
Description:
This dataset contains metadata extracted from the materials preserved in the archival fonds of the International Institute of Intellectual Cooperation (IIIC), which have recently been digitized (available at https://atom.archives.unesco.org/iiic). More precisely, the dataset focuses on subseries A and F of the Correspondence series. Using machine learning and natural language processing (NLP) techniques, we parsed the scanned documents and extracted the following metadata from them: mentions of people and locations, language (e.g., French), nature of the material (e.g., letter vs. attached document), formal aspects (e.g., handwritten vs. typewritten), and, where possible, the year of publication. We also linked each entity (e.g., a given person) and each piece of information to the specific document(s) in which it appears. The dataset is divided into three files: one focused on people and two on locations (one for countries and another for cities). It was generated within the ERC-StG project "Social Networks of the Past: Mapping Hispanic and Lusophone Literary Modernity, 1898-1959".
Provider:
CORA.Repositori de Dades de Recerca
Created:
2024-04-11



