Farsi Twitter botnet 2021

Source: Science Data Bank (ScienceDB). Updated: 2024-07-26. Indexed: 2026-04-23.
Download link:
https://www.scidb.cn/detail?dataSetId=b93aa563200b4d64862f3e6d98ba61aa
Description:
This dataset contains the directed networks analyzed in the article "Bots Election: Unveiling the complex network of social botnets." From April 29, 2021, to June 24, 2021 (7 weeks prior to the election), we identified 97 popular hashtags associated with the Iranian presidential election through daily tracking of trending hashtags. We developed a program called "Twitter Machine": given a keyword and a start date, it stores every tweet containing that keyword from the specified date in an SQLite database. The tweets are retrieved through the Twitter Standard Search API. Twitter Machine parses the JSON returned by the API, stores the necessary information in the SQLite database, and updates the user index in a PostgreSQL database, so that every participant in a debate gathered by Twitter Machine is recorded in that relational database. In total, 8,818,675 tweets containing election hashtags were collected, posted by 153,115 unique users.

To evaluate user authenticity, we used Botometer, a supervised machine-learning tool, to obtain bot scores and the Complete Automation Probability (CAP) for each user. The universal CAP, which is the language-agnostic complete automation probability, is taken from the Botometer results for each account and used as that account's bot score. To improve the precision of the research, we manually annotated over 1,000 accounts with the help of crowdsourcing.

We then split the accounts into three categories based on their bot score: accounts in the top 30 percent of the CAP score range (> 0.7) are labeled "bots," accounts in the bottom 30 percent (<= 0.3) are labeled "genuine users," and the rest are labeled "gray users." This classification yields 71.2% of accounts as bots, 27.6% as gray users, and 1.2% as genuine users. Our analysis focused on the friendship and retweet networks, which seemed more suitable for determining users' social media influence.
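The three-way split by CAP score described above can be illustrated with a minimal sketch. It assumes the thresholds apply directly to the universal CAP value in [0, 1]; the function names and the example scores are hypothetical and are not part of the dataset or of Botometer.

```python
from collections import Counter

BOT_THRESHOLD = 0.7      # CAP strictly above this value -> "bot"
GENUINE_THRESHOLD = 0.3  # CAP at or below this value -> "genuine"


def categorize(cap: float) -> str:
    """Map a universal CAP score in [0, 1] to one of the three categories."""
    if not 0.0 <= cap <= 1.0:
        raise ValueError(f"CAP must lie in [0, 1], got {cap}")
    if cap > BOT_THRESHOLD:
        return "bot"
    if cap <= GENUINE_THRESHOLD:
        return "genuine"
    return "gray"


def category_shares(cap_by_user: dict) -> dict:
    """Return the fraction of accounts that falls into each category."""
    counts = Counter(categorize(cap) for cap in cap_by_user.values())
    total = sum(counts.values())
    if total == 0:
        return {"bot": 0.0, "gray": 0.0, "genuine": 0.0}
    return {label: counts[label] / total for label in ("bot", "gray", "genuine")}


# Hypothetical scores, for illustration only (not taken from the dataset).
example_caps = {"user_a": 0.92, "user_b": 0.70, "user_c": 0.30, "user_d": 0.81}
labels = {user: categorize(cap) for user, cap in example_caps.items()}
shares = category_shares(example_caps)
```

Note the boundary handling: a score of exactly 0.7 falls into the gray category, while a score of exactly 0.3 counts as genuine, matching the "> 0.7" and "<= 0.3" conditions stated in the description.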
In the retweet network, each node represents a user who tweeted an election-related post with a hashtag. If user i retweets a post from user j, a directed edge connects i to j in the retweet network. Likewise, in the friendship network, a directed edge connects user i to user j when i follows j. Within each network, we isolated the sub-graph of nodes that share the same bot-score category to allow a more detailed comparison of these categories and their internal interactions. Our subsequent focus is on interactions between bots in both networks.

To identify the supporters of each political figure, we used modularity classes to group nodes into clusters. (For this article, only the seven largest communities are examined; smaller communities are excluded.) We then manually examined the most-retweeted tweets from the top 10 nodes in each community during the election period to determine the political preferences of these users, and labeled each cluster accordingly. Clusters promoting or advocating specific candidates or political stances (i.e., "pro" and "anti") are named after their communities.

Sensitive data has been omitted from this dataset in accordance with the sensitive data policy.

The dataset contains the following networks:
fl-bots: The friendship network of bots
fl-grays: The friendship network of gray users
fl-users: The friendship network of genuine users
rt-bots: The retweet network of bots
rt-grays: The retweet network of gray users
rt-users: The retweet network of genuine users
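The directed-edge definition and the same-category sub-graphs described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: edges are stored as plain (i, j) tuples, repeated retweets between the same pair are collapsed into a single edge, and all names and sample values are hypothetical.

```python
def build_directed_edges(pairs):
    """Turn (i, j) pairs into a set of directed edges i -> j.

    For the retweet network, (i, j) means user i retweeted user j;
    for the friendship network, it means user i follows user j.
    Assumption: duplicate pairs collapse into one unweighted edge.
    """
    return {(i, j) for i, j in pairs}


def same_category_subgraph(edges, category_by_user, category):
    """Keep only the edges whose two endpoints both belong to `category`."""
    return {
        (i, j)
        for i, j in edges
        if category_by_user.get(i) == category
        and category_by_user.get(j) == category
    }


# Hypothetical retweet events and categories, for illustration only.
retweets = [("a", "b"), ("b", "c"), ("a", "b"), ("d", "a"), ("c", "d")]
categories = {"a": "bot", "b": "bot", "c": "gray", "d": "bot"}

rt_edges = build_directed_edges(retweets)
rt_bots = same_category_subgraph(rt_edges, categories, "bot")
```

In this toy example, `rt_bots` plays the role of a file such as rt-bots: it retains only the bot-to-bot edges and discards any edge that touches a gray or genuine account.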
Provided by:
Moradi Parham; Bigdeli Parsa
Created:
2024-07-25