Replication Data for: A Timely Intervention: Tracking the Changing Meanings of Political Concepts with Word Vectors

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://doi.org/10.7910/DVN/CGNX3M
Description:
Word vectorization is an emerging text-as-data method that shows great promise for automating the analysis of semantics – here, the cultural meanings of words – in large volumes of text. Yet successes with this method have largely been confined to massive corpora where the meanings of words are presumed to be fixed. In political science applications, however, many corpora are comparatively small and many interesting questions hinge on the recognition that meaning changes over time. Together, these two facts raise vexing methodological challenges. Can word vectors trace the changing cultural meanings of words in typical small corpora use cases? I test four time-sensitive implementations of word vectors (word2vec) against a gold standard developed from a modest dataset of 161 years of newspaper coverage. I find that one implementation method clearly outperforms the others in matching human assessments of how public dialogues around equality in America have changed over time. In addition, I suggest best practices for using word2vec to study small corpora for time series questions, including bootstrap resampling of documents and pre-training of vectors. I close by showing that word2vec allows granular analysis of the changing meaning of words, an advance over other common text-as-data methods for semantic research questions.
Created:
2019-04-05