Constructing Misinformation Datasets
Source: arXiv. Updated: 2022-02-25. Indexed: 2024-06-21.
Download link:
https://github.com/USCMelady/Constructing-Misinformation-Datasets-WWW-2022
Description:
Created by researchers at the University of Southern California, this dataset supports building large-scale, misinformation-labeled datasets from social media discourse. It contains approximately 14,600 entries, drawn primarily from Twitter, covering content related to COVID-19 vaccines. To construct it, the researchers used news source credibility labels as weak labels and then applied model-guided label refinement. The dataset is intended to support fast identification and classification of misinformation on social media, along with related research and policy-making.
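The weak-labeling step mentioned in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: the `SOURCE_CREDIBILITY` table, its placeholder domains, the `weak_label` function, and the rule of abstaining when sources disagree are all assumptions made for illustration. The model-guided refinement step is not shown, since the description gives no details about it.

```python
from urllib.parse import urlparse

# Hypothetical credibility lookup. The domains and labels are placeholders,
# not values taken from the dataset.
SOURCE_CREDIBILITY = {
    "reliable-news.example": "reliable",
    "dubious-news.example": "unreliable",
}


def weak_label(post_urls):
    """Derive a weak label for a post from the domains of the links it shares.

    Returns "reliable", "unreliable", or None when no known source is linked
    or when the linked sources disagree (an assumed abstention rule).
    """
    labels = set()
    for url in post_urls:
        # hostname is lowercased and excludes any port; it may be None
        domain = urlparse(url).hostname or ""
        if domain.startswith("www."):
            domain = domain[4:]
        label = SOURCE_CREDIBILITY.get(domain)
        if label is not None:
            labels.add(label)
    # Assign a label only when every recognized source agrees
    if len(labels) == 1:
        return labels.pop()
    return None
```

Labels produced this way are noisy by design, because a post linking to a low-credibility source is not necessarily misinformation. That noise is the reason a refinement step would follow.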
Provided by:
University of Southern California
Created:
2022-02-25



