COVID-19 Related Tweets Dataset

Source: arXiv. Updated: 2023-10-06. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2310.04237v1
Description:
This study uses a dataset of COVID-19 tweets. From an initial 6,420 tweets, filtering and cleaning yielded 3,049 valid entries, of which 2,161 are labeled 'real' and 888 'fake'. The tweets come mainly from official Twitter accounts and fact-checking websites such as PolitiFact, Poynter, and Snopes. Research assistants manually labeled and categorized each tweet to ensure accuracy. The dataset is used to analyze the semantic features of fake and real news, with the aim of improving the authenticity and credibility of social media information through linguistic analysis.
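
The counts in the description imply a noticeably imbalanced label distribution (roughly 71% 'real' to 29% 'fake') and a retention rate of about 47% after cleaning. The short Python sketch below only redoes that arithmetic from the stated figures. It does not load the dataset, and the function and key names are hypothetical, chosen for illustration.

```python
# Counts taken from the dataset description; nothing is loaded from disk.
TOTAL_COLLECTED = 6420  # tweets before filtering and cleaning
REAL = 2161             # entries labeled 'real'
FAKE = 888              # entries labeled 'fake'


def class_summary(real: int, fake: int, collected: int) -> dict:
    """Return the kept-entry count, retention rate, and label shares."""
    kept = real + fake
    return {
        "kept": kept,
        "retention": kept / collected,
        "real_share": real / kept,
        "fake_share": fake / kept,
    }


summary = class_summary(REAL, FAKE, TOTAL_COLLECTED)
for name, value in summary.items():
    # Print counts as integers and ratios as percentages.
    if isinstance(value, int):
        print(f"{name}: {value}")
    else:
        print(f"{name}: {value:.1%}")
```

Because of this imbalance, anyone training a classifier on the data may want to report per-class metrics and not accuracy alone.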
Provider:
Not specified
Created:
2023-10-06