Prop-HiT

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10155423
Description:
Prop-HiT Dataset Version 1.0
Version 1.0: November 18, 2023

About
Prop-HiT is a Propaganda Dataset for Hindi Text. The dataset includes 790 articles from 32 Hindi news websites. It was manually annotated with the LightTag annotation tool, using the following 18 propaganda techniques:

1. Appeal to authority
2. Appeal to fear/prejudice
3. Bandwagon
4. Black-and-white fallacy
5. Causal oversimplification
6. Doubt
7. Exaggeration/minimization
8. Flag-waving
9. Loaded Language
10. Name Calling or Labelling
11. Obfuscation, intentional vagueness, confusion
12. Red herring
13. Reductio ad Hitlerum
14. Repetition
15. Slogans
16. Straw man
17. Thought-terminating cliche
18. Whataboutism

Data format
The dataset consists of one plain-text file and one tab-separated (TSV) file per article. The text file contains the contents of the article. The TSV file contains one propaganda technique annotation per line, with the following fields: article_id, technique, begin_offset, and end_offset.

The naming convention for the files is as follows:
- article[unique_id].txt for the plain-text file
- article[unique_id].labels.tsv for the annotation file

There are two subfolders: train, with 550 articles, and test, with 240 articles.

Credit
Please cite the dataset as:
[Prop-HiT] Deptii Chaudhari, Dr. Ambika Pawar. 2023. Prop-HiT: Propaganda Dataset for Hindi Text. https://doi.org/10.5281/zenodo.10155424

Authors
Deptii Chaudhari; Dr. Ambika Pawar
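The file layout described under "Data format" can be read with a few lines of Python. The following is a minimal sketch, not the authors' loader. It rests on several assumptions the description does not confirm: files are UTF-8 encoded, begin_offset and end_offset are character offsets into the article text with an exclusive end, and any header row in the TSV should be skipped. The names Span, parse_labels, load_article, and annotated_fragments are hypothetical helpers introduced for illustration.

```python
from pathlib import Path
from typing import NamedTuple


class Span(NamedTuple):
    """One annotation line: article_id, technique, begin_offset, end_offset."""
    article_id: str
    technique: str
    begin: int
    end: int


def parse_labels(tsv_text: str) -> list[Span]:
    """Parse the contents of one article[unique_id].labels.tsv file."""
    spans = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        article_id, technique, begin, end = (f.strip() for f in fields)
        if not (begin.isdigit() and end.isdigit()):
            continue  # skip a header row, if the file has one (assumption)
        spans.append(Span(article_id, technique, int(begin), int(end)))
    return spans


def load_article(folder: Path, unique_id: str) -> tuple[str, list[Span]]:
    """Read the text and annotations for one article from train/ or test/."""
    # Assumption: both files are UTF-8 encoded.
    text = (folder / f"article{unique_id}.txt").read_text(encoding="utf-8")
    labels = (folder / f"article{unique_id}.labels.tsv").read_text(encoding="utf-8")
    return text, parse_labels(labels)


def annotated_fragments(text: str, spans: list[Span]) -> list[tuple[str, str]]:
    """Pair each technique with the text it covers.

    Assumption: offsets are character indices and end is exclusive.
    """
    return [(s.technique, text[s.begin:s.end]) for s in spans]
```

For example, `load_article(Path("train"), "123")` would read `train/article123.txt` and `train/article123.labels.tsv`, where `123` is a placeholder id. If the extracted fragments look shifted by one character or cut mid-word, the offset convention assumed here (character-based, exclusive end) is the first thing to check against the actual files.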
Created:
2024-07-14