
Data for: Mapping the Dutch Vaccination Debate on Twitter: Identifying Communities, Narratives, and Interactions

NIAID Data Ecosystem, indexed 2026-03-11
Download link:
https://data.mendeley.com/datasets/fjvk93bc5m
Description:
*Tweets*

We retrieved all Dutch Twitter messages (statuses or tweets) written between 07-28-17 and 12-02-17 that included the words: ‘vaccinatie’, ‘vaccineer’, ‘vaccineert’, ‘vaccineren’, ‘vaccineerde’, ‘vaccineerden’, ‘gevaccineerd’, ‘gevaccineerden’, ‘vaccin’, ‘vaccins’, ‘inenting’, or ‘inenten’. This produced a collection of 2,869 tweets by 1,684 unique users. Many of these tweets resulted from (multiple) interactions between users: 823 of the 2,869 original tweets (28.7%) were replies, 414 (14.4%) were retweets, and 249 (8.7%) were quotes. Many of these statuses would not have been written without an original tweet to retweet, quote, or reply to.

Because we wanted our data to reflect this context, we retrieved the (chains of) tweets that triggered the retweets, quotes, and replies in our initial set. This added 2,437 tweets by 1,197 unique users, 324 of whom were already present in our initial data set. The resulting sample contains 5,306 unique messages written by 2,557 unique users.

*Nodelist and edgelist*

Only a small share of registered Twitter users actively tweet; many users merely lurk or are inactive [21,22]. However, connections between non-tweeting and tweeting users make up a large part of the digital infrastructure that facilitates the circulation of vaccine-related content, and they can be used to reveal the underlying social context. Therefore, for each unique Twitter account in our earlier-retrieved set of tweets (the authors), we retrieved all followers (accounts following the authors: 34,135,154) and followees (accounts followed by the authors: 1,288,618).

We were interested in identifying online communities based on shared interests (who the authors are following) and shared audiences (who the authors are followed by). We therefore excluded followers and followees who were not connected to at least 15 authors. We determined this cut-off point by examining the distribution of the number of connections with authors, and chose the final network size so that it stayed within what our hardware and software could handle for visualization. Ultimately, our network included 121,623 Twitter accounts and 3,706,124 connections.

We used the Louvain algorithm to detect communities in our network. Louvain is known as a fast yet relatively accurate method for detecting communities in large-scale networks.
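
The filtering and community-detection steps described above could be sketched as follows. This is a minimal illustration, not the authors' code. The `(follower, followee)` edge format, the function names `filter_accounts` and `detect_communities`, the decision to always keep authors, the counting of links in either direction, and the treatment of the network as undirected are all assumptions. The Louvain step relies on the third-party NetworkX library (`louvain_communities`, available from version 2.8).

```python
from collections import defaultdict

MIN_AUTHOR_LINKS = 15  # cut-off stated in the dataset description


def filter_accounts(follow_edges, authors, min_links=MIN_AUTHOR_LINKS):
    """Drop non-author accounts linked to fewer than `min_links` authors.

    follow_edges: iterable of (follower, followee) pairs; assumed format.
    authors: set of account ids that wrote tweets in the sample.
    Assumption: authors are always kept, and a link in either direction
    (following or being followed) counts toward the threshold.
    """
    edges = list(follow_edges)

    # For each non-author account, collect the distinct authors it touches.
    linked_authors = defaultdict(set)
    for follower, followee in edges:
        if followee in authors and follower not in authors:
            linked_authors[follower].add(followee)
        if follower in authors and followee not in authors:
            linked_authors[followee].add(follower)

    kept = set(authors) | {
        account
        for account, linked in linked_authors.items()
        if len(linked) >= min_links
    }
    # Keep only edges whose endpoints both survived the filter.
    return [(u, v) for u, v in edges if u in kept and v in kept]


def detect_communities(edges, seed=0):
    """Run Louvain on an undirected view of the filtered edge list."""
    import networkx as nx  # third-party dependency, networkx >= 2.8

    graph = nx.Graph()  # assumption: edge direction is ignored
    graph.add_edges_from(edges)
    return nx.community.louvain_communities(graph, seed=seed)
```

With the real data, `filter_accounts(edges, authors)` would be called with the default cut-off of 15 and its result passed to `detect_communities`. Fixing `seed` makes the Louvain partition reproducible across runs, since the algorithm visits nodes in random order.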
Created:
2019-04-25