Annotated Fake News Dataset in Urdu and Augmentation using Machine Translation
Dataset Overview
Dataset Name
Annotated Fake News Dataset in Urdu and Augmentation using Machine Translation
Release Date
March 03, 2020
Authors
Maaz Amjad, Grigori Sidorov, Alisa Zhila
Institution
Natural Language and Text Processing Laboratory, Center for Computing Research (CIC), Instituto Politécnico Nacional (IPN), Ciudad de México (Mexico City), Mexico
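Dataset Contents
- The original dataset contains 900 Urdu news articles, each annotated as real or fake.
- The augmented dataset contains 400 news articles translated from English into Urdu with the Google Translate machine translation system.
- Several combinations of these datasets are provided for studying the effect of augmentation.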
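As a rough illustration of how such combinations might be assembled, here is a minimal Python sketch. It is not the authors' code: the `Article` type, the `build_combinations` function, the three combination names, and the placeholder records (including their alternating labels) are all assumptions made for this example. Only the set sizes, 900 and 400, come from the description above.

```python
from typing import Dict, List, Tuple

# Hypothetical record type: (article text, label), where label is "real" or "fake".
Article = Tuple[str, str]


def build_combinations(
    original: List[Article], augmented: List[Article]
) -> Dict[str, List[Article]]:
    """Return a few illustrative training-set combinations.

    The combination names are assumptions for this sketch, not the
    names used in the released dataset.
    """
    return {
        "original_only": list(original),
        "augmented_only": list(augmented),
        "original_plus_augmented": list(original) + list(augmented),
    }


# Placeholder records standing in for the real articles. The alternating
# labels are arbitrary and do not reflect the dataset's class balance.
original = [(f"original article {i}", "real" if i % 2 == 0 else "fake") for i in range(900)]
augmented = [(f"translated article {i}", "real" if i % 2 == 0 else "fake") for i in range(400)]

combos = build_combinations(original, augmented)
sizes = {name: len(items) for name, items in combos.items()}
print(sizes)
```

Comparing a classifier trained on each combination against the same held-out set of original Urdu articles is one way to see whether the translated data helps.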
Related Paper
Amjad, M., Sidorov, G., Zhila, A. Data Augmentation using Machine Translation for Fake News Detection in the Urdu Language (2020), LREC 2020 (accepted).
Citation
@inproceedings{Maazaug2020,
  author    = {Maaz Amjad and Grigori Sidorov and Alisa Zhila},
  title     = {Annotated Fake News Dataset in Urdu and Augmentation using Machine Translation},
  booktitle = {Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)},
  pages     = {2530--2535},
  year      = {2020}
}




