Newly Emerged Rumors in Twitter
Mendeley Data · Updated 2024-03-27 · Indexed 2024-06-27
Download link:
https://zenodo.org/record/2563864
Resource description:
These 12 datasets are the results of an empirical study of how newly emerged rumors spread on Twitter. Newly emerged rumors are rumors whose rise and fall happen within a short period of time, in contrast to long-standing rumors. In particular, we focus on newly emerged rumors that gave rise to a simultaneous anti-rumor spread against them. The story of each rumor is as follows:
1. Dataset_R1: The National Football League team in Washington D.C. changed its name to Redhawks.
2. Dataset_R2: A Muslim waitress refused to seat a church group at a restaurant, claiming "religious freedom" allowed her to do so.
3. Dataset_R3: Facebook CEO Mark Zuckerberg bought a "super-yacht" for $150 million.
4. Dataset_R4: Actor Denzel Washington said electing President Trump saved the U.S. from becoming an "Orwellian police state."
5. Dataset_R5: Joy Behar of "The View" sent a crass tweet about a fatal fire in Trump Tower.
6. Dataset_R6: Harley-Davidson's chief executive officer Matthew Levatich called President Trump "a moron."
7. Dataset_R7: The animated children's program 'VeggieTales' introduced a cannabis character in August 2018.
8. Dataset_R8: Michael Jordan resigned from the board at Nike and took his Air Jordan line of apparel with him.
9. Dataset_R9: In September 2018, the University of Alabama football program ended its uniform contract with Nike, in response to Nike's endorsement deal with Colin Kaepernick.
10. Dataset_R10: During confirmation hearings for Supreme Court nominee Brett Kavanaugh, congressional Democrats demanded that the nominee undergo DNA testing to prove he is not Adolf Hitler.
11. Dataset_R11: Singer Michael Bublé's upcoming album will be his last, as he is retiring from making music.
12. Dataset_R12: A screenshot from MyLife.com confirms that mail bomb suspect Cesar Sayoc was registered as a Democrat.
The structure of the Excel file for each dataset is as follows:
- Each row corresponds to one captured tweet/retweet related to the rumor, and each column gives one specific piece of information about that tweet/retweet. From left to right, the columns are:
1. User ID (the user who posted the current tweet/retweet)
2. The description sentence in the profile of the user who published the tweet/retweet
3. The number of tweets/retweets the user had published at the time of posting the current tweet/retweet
4. Date and time of creation of the account from which the current tweet/retweet was posted
5. Language of the tweet/retweet
6. Number of followers
7. Number of followings (friends)
8. Date and time of posting the current tweet/retweet
9. Number of likes (favorites) the current tweet had received before it was crawled
10. Number of times the current tweet had been retweeted before it was crawled
11. Whether the current tweet/retweet contains another tweet (for example, when the current tweet is a quote, reply, or retweet)
12. The source (OS) of the device from which the current tweet/retweet was posted
13. Tweet/Retweet ID
14. Retweet ID (if the post is a retweet, the ID of the tweet that the current post retweets)
15. Quote ID (if the post is a quote, the ID of the tweet that the current post quotes)
16. Reply ID (if the post is a reply, the ID of the tweet that the current post replies to)
17. Frequency of tweet occurrences, meaning the number of times the current tweet is repeated in the dataset (for example, the number of times the tweet appears in the dataset as a retweet posted by others)
18. State of the tweet, reached by agreement between the annotators, which takes one of the following values:
- r: The tweet/retweet is a rumor post
- a: The tweet/retweet is an anti-rumor post
- q: The tweet/retweet is a question about the rumor that neither confirms nor denies it
- n: The tweet/retweet is not related to the rumor (it contains the queries related to the rumor but does not refer to the rumor itself)
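The column layout described above can be turned into a small record schema. The following is a minimal Python sketch, not the authors' tooling: the field names in `COLUMNS` are hypothetical (the description gives the column order but not the header text), rows are assumed to arrive as 18-cell sequences from whatever Excel reader is used, and empty ID cells are assumed to be `None` or an empty string. The precedence retweet, then quote, then reply in `post_type` is also an assumption.

```python
from collections import Counter

# Hypothetical field names, one per column, in the left-to-right order
# given in the description. The real header text is not stated there.
COLUMNS = (
    "user_id", "user_description", "user_post_count", "account_created_at",
    "language", "followers", "followings", "posted_at", "likes", "retweets",
    "contains_other_tweet", "source", "tweet_id", "retweet_id", "quote_id",
    "reply_id", "frequency", "state",
)

# The four annotation labels listed in the description.
STATE_LABELS = {"r": "rumor", "a": "anti-rumor", "q": "question", "n": "not related"}


def row_to_record(row):
    """Map one 18-cell row onto the hypothetical field names."""
    if len(row) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(row)}")
    return dict(zip(COLUMNS, row))


def post_type(record):
    """Classify a post from its ID fields (empty cells assumed None or '')."""
    for field, kind in (("retweet_id", "retweet"),
                        ("quote_id", "quote"),
                        ("reply_id", "reply")):
        if record[field] not in (None, ""):
            return kind
    return "original"


def count_states(records):
    """Tally posts per annotation label, normalizing case and whitespace."""
    counts = Counter(str(rec["state"]).strip().lower() for rec in records)
    return {STATE_LABELS.get(label, label): n for label, n in counts.items()}
```

Applying `count_states` to the records of one dataset gives the balance between rumor and anti-rumor posts for that rumor. Labels outside the four listed values are passed through unchanged rather than dropped, so unexpected annotations stay visible.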
Created:
2023-06-28
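Collected summary
Dataset introduction

Background and challenges
Background overview
This dataset studies how newly emerged rumors spread on Twitter. It contains tweet data for 12 specific rumors, with each dataset recording per-tweet information and an annotated state, which makes it suitable for analyzing the dynamics of rumor propagation on social media.
The content above was collected and summarized by 遇见数据集.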



