
Labeled Datasets for Research on Information Operations

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14141549
Resource description:
Compliance with Platform Terms
To comply with the platform terms, we ask that you download one data file per researcher, per day.

README
19-November-2024
Contact: Observatory on Social Media

Dataset Articles
This dataset is collected and processed according to the paper "Labeled Datasets for Research on Information Operations."

Description
These datasets contain data curated for research on information operations (IO) and include both labeled IO data and control data. They cover 26 verified IO campaigns from various countries and provide comprehensive records of posts from IO accounts alongside control posts from legitimate accounts that discussed similar topics during the same periods. The datasets enable the development and benchmarking of IO detection methods by comparing coordinated accounts with organic ones.

License
This dataset is available under the Attribution-NonCommercial-NoDerivatives 4.0 International license. If you use this data, please cite the original paper.

Dataset Content
The dataset includes anonymized fields to preserve privacy and is structured with the following columns:

postid: Unique identifier for each post within the dataset.
post_text: The textual content of the post. PII inside post_text, such as mentions and URLs, is hashed.
application_name: Hashed version of the name of the application or platform from which the post was made.
post_language: Language in which the post was written.
in_reply_to_postid: Anonymized ID of the post this entry is replying to, if applicable.
in_reply_to_accountid: Anonymized ID of the account the post is replying to, if applicable.
post_time: Timestamp indicating when the post was made.
accountid: Unique anonymized ID for the account that created the post.
account_profile_description: Description provided by the account holder in their profile.
follower_count: Number of followers the account had at the time of data collection.
following_count: Number of accounts the user was following at the time of data collection.
account_creation_date: Date when the account was created.
is_repost: Boolean indicator of whether the post is a repost.
reposted_accountid: Anonymized ID of the original account that made the reposted post, if applicable.
reposted_postid: Anonymized ID of the original post that was reposted, if applicable.
hashtags: Hashtags included in the post content, if any.
urls: Hashed URLs shared within the post, if any.
account_mentions: Anonymized IDs of accounts mentioned within the post, if any.
is_control: Boolean indicator marking whether the post is from a control (True) or IO (False) account.

Data for different campaigns are organized in separate versions of this repository.
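As a rough illustration of how the is_control label might be used, the sketch below splits posts into IO and control groups and counts posts, reposts, and distinct accounts in each. It rests on several assumptions that the description does not confirm: that a campaign file can be read as CSV, and that the boolean columns are stored as text such as "True" or "False". The helper names parse_bool and summarize_by_label and the sample rows are hypothetical and made up for this example; only the column names come from the list above.

```python
import csv
import io


def parse_bool(value):
    """Interpret common textual booleans (the on-disk encoding is an assumption)."""
    return str(value).strip().lower() in {"true", "1", "t", "yes"}


def summarize_by_label(rows):
    """Count posts, reposts, and distinct accounts for control and IO groups."""
    stats = {
        "control": {"posts": 0, "reposts": 0, "accounts": set()},
        "io": {"posts": 0, "reposts": 0, "accounts": set()},
    }
    for row in rows:
        # is_control is True for control accounts and False for IO accounts.
        group = "control" if parse_bool(row["is_control"]) else "io"
        stats[group]["posts"] += 1
        stats[group]["reposts"] += int(parse_bool(row["is_repost"]))
        stats[group]["accounts"].add(row["accountid"])
    return {
        name: {
            "posts": s["posts"],
            "reposts": s["reposts"],
            "accounts": len(s["accounts"]),
        }
        for name, s in stats.items()
    }


# Made-up sample rows using a subset of the documented columns.
SAMPLE = """postid,accountid,is_repost,is_control
p1,a1,False,True
p2,a1,True,True
p3,a2,True,False
p4,a3,False,False
p5,a3,True,False
"""

summary = summarize_by_label(csv.DictReader(io.StringIO(SAMPLE)))
print(summary)
```

To run this on a downloaded file instead of the sample, the same function could be passed a csv.DictReader opened on that file, provided the file really is CSV. If the data ships in another format, only the reading step would need to change.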
Created:
2024-11-20