Time shifting word2vec models from Times
Source: NIAID Data Ecosystem. Indexed: 2026-03-11
Download link:
https://zenodo.org/record/1494140
Description:
Time-shifting word2vec models based on the Times newspaper. These models were generated using the "Generate time shifting models" scripts found here.
In summary, these scripts generate a collection of sentences for each two-year period and train a word2vec model on that period using gensim. The original text from the Times newspaper articles is processed as follows:
Articles are divided into sentences using punctuation.
Punctuation symbols are removed.
Text is converted to lower case.
Words are validated to ensure they are valid English non-stop words (using nltk).
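The steps above can be sketched roughly as follows. This is a minimal illustration, not the original scripts: the function names, the small hard-coded `STOP_WORDS` and `ENGLISH_WORDS` sets (stand-ins for the nltk checks), the `(year, text)` input format, and the choice of non-overlapping periods counted from `first_year` are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins: the original scripts use nltk for the stop-word and
# English-word checks; small hard-coded sets keep this sketch dependency-free.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "was", "to"}
ENGLISH_WORDS = {"parliament", "met", "london", "prices", "rose", "sharply", "today"}


def split_sentences(article):
    """Split an article into sentences on sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"[.!?]+", article) if s.strip()]


def clean_sentence(sentence, vocabulary=ENGLISH_WORDS, stop_words=STOP_WORDS):
    """Remove punctuation, lower-case, and keep valid English non-stop words."""
    no_punct = re.sub(r"[^\w\s]", " ", sentence)
    tokens = no_punct.lower().split()
    return [t for t in tokens if t in vocabulary and t not in stop_words]


def period_start(year, first_year, span=2):
    """Return the first year of the span-year period that contains year."""
    return first_year + ((year - first_year) // span) * span


def sentences_by_period(articles, first_year, span=2):
    """Group cleaned sentences by period; articles is an iterable of (year, text)."""
    periods = defaultdict(list)
    for year, text in articles:
        for sentence in split_sentences(text):
            tokens = clean_sentence(sentence)
            if tokens:  # skip sentences with no surviving words
                periods[period_start(year, first_year, span)].append(tokens)
    return dict(periods)


def train_models(periods):
    """Train one word2vec model per period (requires gensim to be installed)."""
    from gensim.models import Word2Vec  # imported lazily: optional dependency

    # min_count=1 only so that tiny toy inputs work; real corpora would
    # normally keep a higher threshold.
    return {start: Word2Vec(sentences, min_count=1)
            for start, sentences in periods.items()}
```

For example, calling `sentences_by_period` on articles dated 1950, 1951 and 1952 with `first_year=1950` places the first two in the period starting 1950 and the third in the period starting 1952. Passing the result to `train_models` would then yield one model per period.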
The two-year time period was selected following the "Measure convergence for a range" procedure described here.
This data publication was made possible thanks to collaboration with the Utrecht Digital Humanities Lab.
Unfortunately, the original Times data set is not publicly available.
Created:
2020-01-24



