
Dataset for "The Asian American Literature We've Constructed"

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://doi.org/10.7910/DVN/O0RXGX
Description:
"Text Author Scholarship Metadata.tab" includes all the metadata on primary text titles, publication years, authorial gender, authorial race and ethnicity, and scholarship year that we collected and used to derive the results on contemporaneity, gender balance, and ethnic inequalities. Some of the metadata we used is proprietary to the MLA Bibliography, so we cannot share more information on each piece of scholarship that cites an Asian American primary text. We have included accession numbers that will take you to the relevant record in the MLA Bibliography (and DOIs for scholarship from Amerasia Journal, which is not indexed in the MLA Bibliography).

"Chinese and Filipinx ethnic specific and panethnic citations.tab" includes all the metadata we used to calculate the results presented in figures 6 and 7. The citation counts under the panethnic label were derived from the metadata in "Text Author Scholarship Metadata.tab". The citation counts under the ethnic specific labels were collected from the MLA Bibliography through searches for "Chinese American" and "Filipino American/Filipino/Filipina" in the titles and abstracts of scholarly works.

The topic modeling results in the article were based on a corpus accessed through the HathiTrust Research Center, with about 100 additional texts we digitized ourselves because they are not available in HathiTrust. "Topic modeling corpus composition.tab" shows the texts in that corpus and their HathiTrust IDs if the text came from HathiTrust. (Note that the corpus includes some cited pieces that are part of larger collections, for instance a short story in a story collection. The HathiTrust IDs listed for such works are IDs for the whole collection. We cut such texts down to just the cited piece before topic modeling them. There are also instances where both a piece from a collection and the whole collection were cited. In those instances, we included both the whole collection and the piece in our corpus.)

"Topics, top words, ethnic coding.tab" shows the topics generated from this corpus when we ran MALLET, the top 50 words in each topic, and how we coded each topic for ethnic affiliation.

"Topic percentages in chunked texts.tab" shows the proportion MALLET attributed to each topic for each 1000-word text chunk in the corpus. We averaged the proportions for ethnically affiliated topics across all the chunks of a text and then weighted these results by the number of times the text has been cited in Asian Americanist scholarship. Those results are presented in "Percentages of ethnically coded topics in whole texts weighted by citations.tab".
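The averaging and weighting step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the text identifiers, and the sample numbers are all hypothetical, and the exact weighting formula is an assumption. The description says only that per-text averages were weighted by citation counts, so the sketch uses a citation-weighted mean across texts as one plausible reading. It also does not assume any particular column layout in the .tab files.

```python
from collections import defaultdict


def mean_share_per_text(chunk_shares):
    """Average a topic's share across all chunks of each text.

    chunk_shares: iterable of (text_id, share) pairs, one pair per
    1000-word chunk. Returns {text_id: mean share}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text_id, share in chunk_shares:
        totals[text_id] += share
        counts[text_id] += 1
    return {text_id: totals[text_id] / counts[text_id] for text_id in totals}


def citation_weighted_mean(per_text, citations):
    """Combine per-text means, weighting each text by its citation count.

    ASSUMPTION: a weighted mean is only one way to read "weighted by the
    number of times the text has been cited".
    """
    total_weight = sum(citations[text_id] for text_id in per_text)
    if total_weight == 0:
        raise ValueError("citation counts sum to zero")
    weighted_sum = sum(per_text[t] * citations[t] for t in per_text)
    return weighted_sum / total_weight


# Hypothetical data: two texts, three chunks in total.
chunk_shares = [("text_a", 0.25), ("text_a", 0.75), ("text_b", 0.125)]
citations = {"text_a": 3, "text_b": 1}

per_text = mean_share_per_text(chunk_shares)
overall = citation_weighted_mean(per_text, citations)
print(per_text, overall)
```

In this toy input, the more frequently cited text contributes three times as much to the combined figure as the text cited once, which is the effect the weighting is meant to capture.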
Created:
2021-04-21