Features for the classification of Spanish American 19th century novels by subgenre (part of data-nh)
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4449493
Description:
This dataset contains feature sets that were prepared for the classification of Spanish American 19th century novels by subgenre. There are two main types of feature sets: features based on the most frequent words (MFW), and topic features. The MFW-based features include basic MFW, word n-grams, and character n-grams, each with different numbers of MFW and with tf, tf-idf, or z-score normalization. The topic features are derived from topic models created with different parameters (number of topics, optimization intervals).
These feature sets were used in a classification analysis as a part of the dissertation "Genre Analysis and Corpus Design: 19th Century Spanish American Novels (1830-1910)" by Ulrike Henny-Krahmer. The dataset is part of "data-nh" (see https://github.com/cligs/data-nh), which is the whole collection of research data accompanying the above-mentioned dissertation.
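The MFW-based normalizations mentioned above can be illustrated with a minimal sketch. It is not the code used to build this dataset: the function name `mfw_features`, the whitespace tokenization, the toy documents, and the use of the population standard deviation for z-scores are all assumptions made for illustration. It shows relative term frequencies (tf) for the n most frequent words and their per-word z-scores across documents.

```python
from collections import Counter
from statistics import mean, pstdev


def mfw_features(docs, n_mfw):
    """Return (vocab, tf, z) for the n_mfw most frequent words in docs.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    which is an assumption and not necessarily how the dataset was built.
    """
    tokenized = [doc.lower().split() for doc in docs]
    corpus_counts = Counter(tok for toks in tokenized for tok in toks)
    # Rank by descending corpus count, then alphabetically to break ties.
    ranked = sorted(corpus_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    vocab = [word for word, _ in ranked[:n_mfw]]

    # tf: relative frequency of each MFW within each document.
    tf = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        tf.append([counts[word] / total for word in vocab])

    # z-score: standardize each word's tf across documents
    # (population standard deviation, an assumption).
    z = [[0.0] * len(vocab) for _ in docs]
    for j in range(len(vocab)):
        column = [row[j] for row in tf]
        mu, sigma = mean(column), pstdev(column)
        for i in range(len(docs)):
            z[i][j] = (tf[i][j] - mu) / sigma if sigma > 0 else 0.0
    return vocab, tf, z


# Toy documents, invented for this example.
docs = [
    "la casa de la ciudad",
    "la noche de la pampa",
    "el amor de la patria",
]
vocab, tf, z = mfw_features(docs, n_mfw=2)
```

A tf-idf variant would additionally down-weight words that occur in most documents. Words with identical frequency in every document get a z-score of 0.0 here to avoid division by zero.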
Created:
2021-01-20



