Metadata and Modeling Outputs for study "The Quiet Transformations of Literary Studies". Data

Name: Metadata and Modeling Outputs for study "The Quiet Transformations of Literary Studies". Data
Creator: No Publisher Supplied
Published: 2026-01-07 20:11:05
License: No description available

DataCite Commons. Updated 2026-01-07; indexed 2024-07-13.

Download link:

https://rucore.libraries.rutgers.edu/rutgers-lib/44747

Description:

New analytical approaches, such as topic modeling, can illuminate subtle transformations and reveal that concepts frequently taken for granted are more variable than scholars have assumed. In this study, the modeled corpus comprised 21,367 JSTOR articles with 13,221 distinct author names, and the result was a 150-topic model. Four files supporting the study are available here:

1) vocab.txt: UTF-8 text, one word per line, giving all 98,835 word types included in the model. The list of stop words excluded from this vocabulary is given at https://www.ideals.illinois.edu/handle/2142/45709.

2) id_map.txt: UTF-8 text, one string per line, giving the JSTOR ID strings of all 21,367 documents included in the model, in the order indexed by the sampling state file.

3) mallet_state.gz (370MB): gzip'd UTF-8 text representing the final sampling state output by MALLET. Each token of the input documents is represented by a single line with six whitespace-delimited fields: document index, document label (unused), token index, word type index, word type as a string, and topic index. The word type index is zero-based and corresponds to the order in vocab.txt. The document index is zero-based and corresponds to the order in id_map.txt.

4) metadata.tar.gz (3.9MB): gzip'd tar archive of 8 CSV files containing metadata for the documents modeled. Metadata for a document in the model can be located by matching the "id" column to the IDs given in id_map.txt.
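
As an illustration of the file layout described above, the following Python sketch parses six-field state lines and tallies topic assignments per JSTOR ID. It is a minimal sketch, not the study's own code. The names Token, parse_state_lines, read_lines, topic_counts_by_document, and summarize are hypothetical, and the skipping of lines that begin with "#" is an assumption about header lines in the state file, which the description does not mention.

```python
import gzip
from collections import Counter, defaultdict
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    """One line of the sampling state (hypothetical container name)."""
    doc_index: int    # zero-based, indexes id_map.txt
    doc_label: str    # unused, per the description
    token_index: int
    type_index: int   # zero-based, indexes vocab.txt
    word: str
    topic: int


def parse_state_lines(lines: Iterable[str]) -> Iterator[Token]:
    """Yield a Token for each six-field line.

    Blank lines and lines starting with '#' are skipped; treating '#'
    lines as headers is an assumption, not something the record states.
    """
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) != 6:
            raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
        doc, label, pos, type_idx, word, topic = fields
        yield Token(int(doc), label, int(pos), int(type_idx), word, int(topic))


def read_lines(path: str) -> list:
    """Read a one-entry-per-line UTF-8 file such as vocab.txt or id_map.txt."""
    with open(path, encoding="utf-8") as fh:
        return [ln.rstrip("\n") for ln in fh]


def topic_counts_by_document(tokens: Iterable[Token], id_map: list) -> dict:
    """Map each JSTOR ID to a Counter of topic index -> token count."""
    counts = defaultdict(Counter)
    for tok in tokens:
        counts[id_map[tok.doc_index]][tok.topic] += 1
    return counts


def summarize(state_path: str, id_map_path: str) -> dict:
    """Stream the gzip'd state file without loading it all into memory."""
    id_map = read_lines(id_map_path)
    with gzip.open(state_path, "rt", encoding="utf-8") as fh:
        return topic_counts_by_document(parse_state_lines(fh), id_map)
```

A call such as summarize("mallet_state.gz", "id_map.txt") would return per-document topic counts keyed by JSTOR ID, and the same IDs can then be matched against the "id" column of the metadata CSV files. Streaming the state file line by line is a deliberate choice here, since the compressed file alone is 370MB.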

Provider:

No Publisher Supplied

Created:

2014-09-22
