Corpus of 19th century Spanish American novels for family resemblance analysis (part of data-nh)

NIAID Data Ecosystem, indexed 2026-03-12

Download link:

https://zenodo.org/record/4449013

Description:

This dataset contains several formats of a corpus of 19th-century Spanish American novels that were used in a family resemblance analysis as part of the dissertation "Genre Analysis and Corpus Design: 19th Century Spanish American Novels (1830-1910)" by Ulrike Henny-Krahmer. The texts are included as plain text files, as linguistically annotated files (produced with TreeTagger), as text files containing only noun lemmas, and as chunks of 1,000 noun tokens derived from the lemmatized texts. The texts were prepared in this way for use with topic modeling. Because 22 of the novels are still under copyright, this dataset has restricted access. The other 234 novels are in the public domain. This dataset is part of "data-nh" (see https://github.com/cligs/data-nh), the full collection of research data accompanying the above-mentioned dissertation.
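The chunking step mentioned in the description can be illustrated with a minimal Python sketch. It is not the code used to build the dataset: the function name `chunk_tokens`, the `keep_remainder` parameter, and the handling of a final chunk shorter than 1,000 tokens are assumptions made for illustration only, since the description does not say how leftover tokens were treated.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int = 1000,
                 keep_remainder: bool = False) -> List[List[str]]:
    """Split a list of noun lemmas into consecutive chunks of `size` tokens.

    Hypothetical helper: whether the original dataset keeps or drops a
    final chunk shorter than `size` is not stated, so both are supported.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    # Slice the token list into consecutive windows of `size` tokens.
    chunks = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    # Optionally discard a trailing chunk that is shorter than `size`.
    if chunks and not keep_remainder and len(chunks[-1]) < size:
        chunks.pop()
    return chunks


# Example with placeholder lemmas standing in for one novel's noun lemmas.
lemmas = ["casa"] * 2500
full_chunks = chunk_tokens(lemmas)
all_chunks = chunk_tokens(lemmas, keep_remainder=True)
```

Under these assumptions, each resulting chunk would serve as one document for a topic model, so that long and short novels contribute units of comparable size.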

Created:

2021-01-19