
Indonesian COVID-19 Vaccination-related Tweets for Stance Detection and Aspect-based Sentiment Analysis

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://data.mendeley.com/datasets/7ky2jbjwtn
Description:
The dataset was collected through the Twitter API using specific keywords and covers tweets posted over ten months, from January 2021 to October 2021. The data were filtered to remove tweets not written in Bahasa Indonesia, tweets unrelated to the target, spam, and duplicates. There are two labeling processes: stance and aspect-based sentiment. Three annotators manually labeled the sample data, and the final class label was determined by majority vote. For stance labeling, each annotator was asked to mark each tweet as "Favor", "Against", or "Neutral" toward COVID-19 vaccination programs in Indonesia. For aspect-based sentiment labeling, each tweet was annotated against seven predetermined aspects of COVID-19 vaccination: "Services", "Implementation", "Apps", "Costs", "Participants", "Vaccine-products", and "General". Each aspect takes one of two sentiment values, "Positive" or "Negative".
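
The majority-vote step described above could be sketched roughly as follows. This is only an illustration, not the dataset authors' code: the function name `majority_vote` and the example votes are hypothetical, and returning `None` when no label wins a strict majority is an assumption, since the description does not say how three-way disagreements were resolved.

```python
from collections import Counter
from typing import Optional, Sequence

# Stance labels named in the dataset description.
STANCE_LABELS = ("Favor", "Against", "Neutral")


def majority_vote(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators.

    Returns None when no label has more than half of the votes
    (assumed behavior; the source does not describe tie handling).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Hypothetical votes from three annotators for a single tweet.
example_votes = ["Favor", "Favor", "Neutral"]
final_label = majority_vote(example_votes)
print(final_label)
```

The same function would apply to the per-aspect sentiment labels, where each annotator's vote is "Positive" or "Negative".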

Created:
2022-08-30