Microblog PCU dataset: a dataset used to explore spammers in microblogs
Indexed by Payititi (帕依提提) on 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26148.html
Resource description:
Jun Liu (liukeen '@' mail.xjtu.cn), Hao Chen (lechenhao '@' gmail.com), Mengting Zhan, Jianhong Mi, Yanzhang Lv
MOEKLINNS Lab, Department of Computer Science, Xi'an Jiaotong University, China

Data Set Information:
We use this dataset to explore spammers in microblogs. Our demo system can be accessed at [Web link]; please add :8080 after the domain name as the port. (The repository webpage fails to parse the web link when it is added in the source; this is under inspection.)

Attribute Information:

weibo_user.csv has the following attributes:
-user_id: account ID in Sina Weibo;
-user_name: account nickname;
-gender: account registration gender (male, female, or other);
-class: account level given by Sina Weibo;
-message: account registration location or other personal information;
-post_num: the number of posts by this account to date;
-follower_num: the number of followers of this account;
-followee_num: the number of accounts this account follows;
-follow ratio: followee_num/follower_num;
-is_spammer: manually annotated label; 1 means spammer and -1 means non-spammer;

user_post.csv has the following attributes:
-post_id: post ID given by Sina Weibo;
-post_time: the time when the post was published;
-poster_id: the user ID of the account that published the post;
-repost_num: the number of reposts by others;
-commnet_num: the number of comments by others;

followe-followee.csv has the following attributes:
-follower: the nickname of the follower;
-follower_id: the user ID of the follower;
-followee: the nickname of the followee;
-followee_id: the user ID of the followee;

post.csv is almost the same as user_post.csv, and the posts in it were retrieved by a keyword related to a topic; it also has:
-content: the post text (mostly in Chinese; please configure Microsoft Office so that it displays the text correctly)

Relevant Papers: N/A

Citation Request:
Thanks to MOEKLINNS Lab [[Web link]], especially the Spammer Detection Group, for opening its data.
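As an illustration of the attribute list above, the following is a minimal Python sketch of how rows shaped like weibo_user.csv might be read and labelled. It makes several assumptions that the description does not confirm: that the file has a header row whose column names match the attribute names exactly, and that the numeric fields parse as integers. The two sample rows are made up for the example and are not taken from the dataset. The helper name load_users is hypothetical. The follow ratio is recomputed as followee_num/follower_num rather than read from the file.

```python
import csv
import io

# Made-up rows for illustration only; they are NOT taken from the dataset.
# The header names are an assumption based on the attribute list.
SAMPLE_WEIBO_USER = """user_id,user_name,gender,class,message,post_num,follower_num,followee_num,is_spammer
1001,alice,female,3,Xi'an,120,400,100,-1
1002,bob,male,1,,15,0,900,1
"""


def load_users(fileobj):
    """Parse weibo_user.csv-style rows into dicts with typed fields."""
    users = []
    for row in csv.DictReader(fileobj):
        followers = int(row["follower_num"])
        followees = int(row["followee_num"])
        users.append({
            "user_id": row["user_id"],
            "post_num": int(row["post_num"]),
            "follower_num": followers,
            "followee_num": followees,
            # Follow ratio as described: followee_num / follower_num.
            # None when the account has no followers (ratio undefined).
            "follow_ratio": followees / followers if followers else None,
            # Label convention from the description: 1 = spammer, -1 = non-spammer.
            "is_spammer": int(row["is_spammer"]) == 1,
        })
    return users


users = load_users(io.StringIO(SAMPLE_WEIBO_USER))
spammer_ids = [u["user_id"] for u in users if u["is_spammer"]]
print(spammer_ids)
```

To read the real file, the in-memory sample could be replaced with open("weibo_user.csv", newline="", encoding=...). The description does not state the file encoding, so it would need to be determined by inspection, since the text is mostly Chinese.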
Provider:
Payititi (帕依提提)



