Deleted Wikipedia articles (spam/vandalism/attack)
Source: DataCite Commons · Updated: 2020-09-03 · Indexed: 2024-07-25
Download link:
https://figshare.com/articles/dataset/Deleted_Wikipedia_articles_spam_vandalism_attack_/4245035/1
Description:
This dataset contains a random sample of deleted articles from English Wikipedia where the stated deletion reason was explicitly spam, vandalism, or attack. 25 articles were sampled for each deletion reason, for a total of 75 articles. The text of the articles was censored to remove identifying information.

The dataset contains the following columns:
· page_title -- The title of the deleted page
· rev_id -- The rev_id of the first revision
· creation_timestamp -- The time that the page was created
· archived -- 1 if the page was deleted, 0 if not (always 1)
· draft_quality -- The deletion reason (spam|vandalism|attack)
· censored_text -- The censored text of the deleted page

Censored blocks are noted with a comment block in the censored_text column of the form "Censored: [reason]([explanation])", e.g. "Censored: PII(phone number)".
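The censor-marker format described above can be picked out with a regular expression. The following is a minimal sketch, not part of the dataset's own tooling. The function names `find_censored_blocks` and `count_reasons` are hypothetical, and the pattern assumes only the "Censored: [reason]([explanation])" shape given in the description. The exact comment delimiters around each marker are not specified, so the pattern ignores them.

```python
import re
from collections import Counter

# Assumption: markers follow the documented shape "Censored: reason(explanation)".
# The delimiters of the surrounding comment block are unknown, so they are not matched.
CENSOR_RE = re.compile(r"Censored:\s*([^()\s]+)\s*\(([^)]*)\)")


def find_censored_blocks(text):
    """Return a list of (reason, explanation) pairs found in one censored_text value."""
    return [(m.group(1), m.group(2).strip()) for m in CENSOR_RE.finditer(text)]


def count_reasons(texts):
    """Tally censor reasons across an iterable of censored_text values."""
    counts = Counter()
    for text in texts:
        for reason, _explanation in find_censored_blocks(text):
            counts[reason] += 1
    return counts


if __name__ == "__main__":
    # Illustrative string only; it is not taken from the dataset.
    sample = 'Call now at "Censored: PII(phone number)" for a deal.'
    print(find_censored_blocks(sample))
```

Applied to the censored_text column, `count_reasons` would give a rough view of how often each kind of redaction occurs. This holds only if the markers in the actual file match the shape assumed here.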
Provider:
figshare
Created:
2016-11-21



