veranchos/arg_mining_tweets
Stance and premise classification dataset for COVID-19-related tweets
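Dataset Description
This dataset was prepared for SMM4H'22 Task 2: classification of stance and premise in tweets about health mandates related to COVID-19.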
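Data Contents
- Training and test data:
  - Training data: tweets used in SMM4H 2022 Task 2, annotated for stance and premise prediction with respect to COVID-19 mandates (such as stay-at-home orders, school closures, and mask mandates).
  - Test data: 2070 annotated tweets about vaccine mandates that were not used in the official SMM4H competition.
- Additional data: 600 annotated tweets about vaccine mandates that were left unused because of low inter-annotator agreement.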
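The files described above could be inspected along the following lines. This is only a minimal sketch: it assumes that the identifier at the top of this card, `veranchos/arg_mining_tweets`, is a Hugging Face Hub dataset ID and that the repository loads with the `datasets` library without a configuration name. Neither assumption is confirmed by this card. The helper names `describe_splits` and `load_tweets` are hypothetical and exist only for illustration.

```python
# Identifier taken from the top of this card; treating it as a Hub dataset ID is an assumption.
REPO_ID = "veranchos/arg_mining_tweets"


def describe_splits(splits):
    """Map each split name to its number of rows (works on any mapping of sized objects)."""
    return {name: len(rows) for name, rows in splits.items()}


def load_tweets(repo_id=REPO_ID):
    """Download the dataset; requires `pip install datasets` and network access."""
    # Imported lazily so that describe_splits can be used without the library installed.
    from datasets import load_dataset

    return load_dataset(repo_id)


# Example usage (not executed here because it needs network access):
#   dataset = load_tweets()
#   for name, size in describe_splits(dataset).items():
#       print(f"{name}: {size} rows")
```

The split names and column names are deliberately not hard-coded, since this card does not list them; printing the split sizes first is a cheap way to check them against the counts given above.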
Citation
If you find this dataset useful, please cite:
@inproceedings{davydova-tutubalina-2022-smm4h,
    title = "{SMM}4{H} 2022 Task 2: Dataset for stance and premise detection in tweets about health mandates related to {COVID}-19",
    author = "Davydova, Vera and Tutubalina, Elena",
    booktitle = "Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop {\&} Shared Task",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.smm4h-1.53",
    pages = "216--220",
    abstract = "This paper is an organizers{'} report of the competition on argument mining systems dealing with English tweets about COVID-19 health mandates. This competition was held within the framework of the SMM4H 2022 shared tasks. During the competition, the participants were offered two subtasks: stance detection and premise classification. We present a manually annotated corpus containing 6,156 short posts from Twitter on three topics related to the COVID-19 pandemic: school closures, stay-at-home orders, and wearing masks. We hope the prepared dataset will support further research on argument mining in the health field.",
}



