fake-news-UFG/FakeNewsSet
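Dataset Description
- License: MIT
- Task categories: text classification
- Language: Portuguese
- Dataset size: n<1K
- Language details: pt-BR
- Multilinguality: monolingual
- Language creators: found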
Citation Information
If you use "FakeNewsSet", please cite:
@inproceedings{10.1145/3428658.3430965,
  author = {da Silva, Fl{\'a}vio Roberto Matias and Freire, Paulo M{\'a}rcio Souza and de Souza, Marcelo Pereira and de A. B. Plenamente, Gustavo and Goldschmidt, Ronaldo Ribeiro},
  title = {FakeNewsSetGen: A Process to Build Datasets That Support Comparison Among Fake News Detection Methods},
  year = {2020},
  isbn = {9781450381963},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3428658.3430965},
  doi = {10.1145/3428658.3430965},
  abstract = {Due to easy access and low cost, social media online news consumption has increased significantly for the last decade. Despite their benefits, some social media allow anyone to post news with intense spreading power, which amplifies an old problem: the dissemination of Fake News. In the face of this scenario, several machine learning-based methods to automatically detect Fake News (MLFN) have been proposed. All of them require datasets to train and evaluate their detection models. Although recent MLFN were designed to consider data regarding the news propagation on social media, most of the few available datasets do not contain this kind of data. Hence, comparing the performances amid those recent MLFN and the others is restricted to a very limited number of datasets. Moreover, all existing datasets with propagation data do not contain news in Portuguese, which impairs the evaluation of the MLFN in this language. Thus, this work proposes FakeNewsSetGen, a process that builds Fake News datasets that contain news propagation data and support comparison amid the state-of-the-art MLFN. FakeNewsSetGen's software engineering process was guided to include all kind of data required by the existing MLFN. In order to illustrate FakeNewsSetGen's viability and adequacy, a case study was carried out. It encompassed the implementation of a FakeNewsSetGen prototype and the application of this prototype to create a dataset called FakeNewsSet, with news in Portuguese. Five MLFN with different kind of data requirements (two of them demanding news propagation data) were applied to FakeNewsSet and compared, demonstrating the potential use of both the proposed process and the created dataset.},
  booktitle = {Proceedings of the Brazilian Symposium on Multimedia and the Web},
  pages = {241--248},
  numpages = {8},
  keywords = {Fake News detection, Dataset building process, social media},
  location = {S{\~a}o Lu{\'i}s, Brazil},
  series = {WebMedia '20}
}



