
Rawat - Sentiment Analysis of Tweets

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://data.mendeley.com/datasets/zp77hffzxg
Description:
Extracting data from Twitter was the first step, without which no analysis would be possible. Using the ‘Advanced Twitter search’ option, appropriate hashtags and dates were entered to check whether tweets from the desired dates were available. To scrape the tweets and fetch the historical data, a Twitter scraping framework was set up in the ‘Octoparse’ software. The final output was downloaded in Excel format. Nearly 234 tweets were obtained using the hashtags ‘#BipinRawat’, ‘#Karma’, ‘#IAFChoppercrash’, and ‘#IndianAirForce’. The following were filtered out: tweets in regional languages; news and tweets that shared these hashtags but concerned a different context; updates from online news channels; and retweets and replies. For manual annotation, guidelines were framed following the design proposed by Mohammad (2016) in ‘A Practical Guide to Sentiment Annotation: Challenges and Solutions’, and the annotation questionnaire was prepared on the same basis.
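
The description does not say how the filtering was carried out, and it may well have been done by hand in Excel. As a rough illustration only, the hashtag and retweet/reply rules could be sketched in Python as below. The helper names, the sample tweets, and the text conventions for spotting retweets ("RT @" prefix) and replies (leading "@") are assumptions, not part of the dataset. Filtering by language or by off-topic context would need manual review and is not modelled here.

```python
import re

# Hashtags named in the dataset description, lowercased for matching.
HASHTAGS = {"#bipinrawat", "#karma", "#iafchoppercrash", "#indianairforce"}


def hashtags_in(text):
    """Return the set of lowercased hashtags found in a tweet's text."""
    return {tag.lower() for tag in re.findall(r"#\w+", text)}


def is_retweet_or_reply(text):
    """Assumed heuristic: retweets start with 'RT @', replies with '@'."""
    stripped = text.lstrip()
    return stripped.startswith("RT @") or stripped.startswith("@")


def keep_tweet(text):
    """Keep tweets that carry a target hashtag and are not retweets or replies."""
    return bool(hashtags_in(text) & HASHTAGS) and not is_retweet_or_reply(text)


# Made-up sample tweets, for illustration only.
sample = [
    "Deeply saddened by the news. #BipinRawat",
    "RT @someone: Tragic loss #IAFChoppercrash",
    "@someone agreed #IndianAirForce",
    "Unrelated post #Cricket",
]

kept = [tweet for tweet in sample if keep_tweet(tweet)]
```

With the sample above, only the first tweet passes both rules. In practice the rows would come from the exported Excel file rather than a hard-coded list.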

Created:
2023-12-15