Word counts per US county in geo-tagged Tweets posted between 2015 and 2021
Source: DataCite Commons. Updated 2023-03-31; indexed 2024-08-18.
Download link:
https://figshare.com/articles/dataset/Word_counts_per_US_county_in_geo-tagged_Tweets_posted_between_2015_and_2021/20630919
Description:
The zip file contains fourteen Parquet [1] files of two kinds, one of each kind for each of the seven years from 2015 to 2021 inclusive:
- region_counts: for every word found, gives how many times it appeared regardless of capitalization ("count" column), how many times it appeared with at least one capitalized letter ("count_upper"), in how many different counties it appeared ("nr_cells"), and whether we considered it to be a proper noun ("is_proper").
- raw_cell_counts: gives the count for every word by county, regardless of capitalization.

These counts were obtained from geo-tagged Tweets posted in those years within the contiguous US, which were collected through the streaming API of Twitter, more specifically using the "statuses/filter" endpoint [2]. See the project's paper for more details on methodology, and the code repository to reproduce the analysis.

The two text files are our lists of excluded word forms.
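As a rough illustration of how the region_counts columns described above might be used, the following Python sketch filters for widely spread words that are mostly written in lowercase. The column names "count", "count_upper", "nr_cells" and "is_proper" come from the description; everything else is an assumption made for the example: the "word" key, the helper names, the thresholds, the sample rows (which are invented, not taken from the dataset), and the idea that the word may be stored as the Parquet index. The loader relies on pandas with a Parquet engine such as pyarrow installed.

```python
from typing import Iterable, List, Mapping


def load_region_counts(path: str) -> List[dict]:
    """Read one region_counts Parquet file into a list of row dicts.

    `path` is whatever file name the zip actually contains (not assumed here).
    pandas is imported lazily so the helpers below work without it.
    """
    import pandas as pd

    # reset_index() keeps the word itself in case it is stored as the index
    # (an assumption about the file layout).
    return pd.read_parquet(path).reset_index().to_dict("records")


def capitalized_share(row: Mapping) -> float:
    """Fraction of a word's occurrences that had at least one capital letter."""
    total = row["count"]
    return row["count_upper"] / total if total else 0.0


def widespread_common_words(
    rows: Iterable[Mapping], min_cells: int, max_cap_share: float = 0.5
) -> List[Mapping]:
    """Keep words not flagged as proper nouns, seen in at least `min_cells`
    counties, and capitalized in at most `max_cap_share` of occurrences."""
    return [
        r
        for r in rows
        if not r["is_proper"]
        and r["nr_cells"] >= min_cells
        and capitalized_share(r) <= max_cap_share
    ]


# Invented rows that only mimic the documented columns.
sample = [
    {"word": "yall", "count": 1000, "count_upper": 100, "nr_cells": 900, "is_proper": False},
    {"word": "texas", "count": 500, "count_upper": 450, "nr_cells": 800, "is_proper": True},
    {"word": "hella", "count": 200, "count_upper": 20, "nr_cells": 50, "is_proper": False},
]

kept = widespread_common_words(sample, min_cells=100)
print([r["word"] for r in kept])  # prints ['yall']
```

With a real file, `load_region_counts` would supply the rows in place of `sample`; the thresholds are arbitrary and would need tuning to the analysis at hand.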
Provider:
figshare
Created:
2023-02-28



