
Corpus of 19th century Spanish American novels for family resemblance analysis (part of data-nh)

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://zenodo.org/record/4449013
Description:
This dataset contains a corpus of 19th century Spanish American novels in several formats. The corpus was used in a family resemblance analysis as part of the dissertation "Genre Analysis and Corpus Design: 19th Century Spanish American Novels (1830-1910)" by Ulrike Henny-Krahmer. The texts are provided as plain text files, as linguistically annotated files (produced with TreeTagger), as text files containing only noun lemmas, and as chunks of 1,000 noun tokens derived from the lemmatized texts. They were prepared in this way for use with topic modeling. Because 22 of the novels are still under copyright, access to this dataset is restricted. The other 234 novels are in the public domain. This dataset is part of "data-nh" (see https://github.com/cligs/data-nh), the complete collection of research data accompanying the dissertation.
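
The chunking step described above (segments of 1,000 noun tokens cut from the noun-lemma texts) can be illustrated with a minimal Python sketch. This is not the dissertation's actual preprocessing code: the function name `chunk_tokens`, the `keep_remainder` flag, and the choice to drop a final chunk shorter than 1,000 tokens are assumptions made for illustration, and the sample lemmas are invented.

```python
from typing import Iterable, List


def chunk_tokens(
    tokens: Iterable[str],
    chunk_size: int = 1000,
    keep_remainder: bool = False,
) -> List[List[str]]:
    """Split a sequence of tokens into consecutive chunks of chunk_size.

    Assumption: a trailing chunk shorter than chunk_size is dropped unless
    keep_remainder is True. The source does not say how leftovers are handled.
    """
    token_list = list(tokens)
    chunks = [
        token_list[start:start + chunk_size]
        for start in range(0, len(token_list), chunk_size)
    ]
    if chunks and not keep_remainder and len(chunks[-1]) < chunk_size:
        chunks.pop()
    return chunks


# Hypothetical noun lemmas standing in for one lemmatized novel (2,100 tokens).
lemmas = ["casa", "hombre", "día"] * 700
chunks = chunk_tokens(lemmas)
chunks_with_rest = chunk_tokens(lemmas, keep_remainder=True)
```

Fixed-size chunks of this kind are commonly used so that each unit passed to a topic model has a comparable length, regardless of how long the original novel is.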
Created:
2021-01-19