
Yahoo-Yahoo Hash-Tag Tweets Using Sentiment Analysis and Opinion Mining Algorithms

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/4748716
Resource description:
Background: Social media opinion has become a medium for quickly accessing large volumes of valuable, detailed information on any subject within a short period. Twitter, a social microblogging site, generates over 330 million tweets monthly across different countries. Analysing trending topics on Twitter presents opportunities to extract meaningful insight into different opinions on various issues.

Aim: This study aims to gain insights into the trending yahoo-yahoo topic on Twitter using content analysis of selected historical tweets.

Methodology: The widgets and workflow engine in the Orange Data Mining toolbox were employed for all the text mining tasks. 5500 tweets were collected from Twitter using the "yahoo yahoo" hashtag. The corpus was pre-processed using a pre-trained tweet tokenizer. Valence Aware Dictionary for Sentiment Reasoning (VADER) was used for the sentiment and opinion mining, Latent Dirichlet Allocation (LDA) and Latent Semantic Indexing (LSI) were used for topic modelling, and Multidimensional Scaling (MDS) was used to visualize the modelled topics.

Results: The word "yahoo" appeared in the corpus 9555 times, and 175 unique tweets were returned after duplicate removal. Contrary to expectation, Spain had the highest number of participants tweeting on the "yahoo yahoo" topic within the period. VADER sentiment analysis classified 35.85%, 24.53%, 15.09%, and 24.53% of tweets as negative, neutral, no-zone, and positive, respectively. The word "yahoo" was highly representative of LDA topics 1, 3, 4, and 6 and of LSI topic 1.

Conclusion: It can be concluded that emojis are more representative of the sentiments in tweets, and convey them faster, than the textual content. Also, despite popular belief, a significant number of youths regard cybercrime as a detriment to society.
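The duplicate-removal and sentiment-labelling steps named in the Methodology can be illustrated with a short sketch. The study itself ran these steps through Orange widgets, so this is not the authors' code. It assumes compound scores have already been produced by a VADER-style scorer and applies the conventional VADER cut-offs of ±0.05. The abstract does not define the "no-zone" category, so the sketch leaves it out. All function names and sample inputs below are hypothetical.

```python
from collections import Counter


def deduplicate(tweets):
    """Drop repeated tweets, ignoring case and differences in whitespace."""
    seen = set()
    unique = []
    for text in tweets:
        key = " ".join(text.lower().split())  # normalised comparison key
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label.

    The 0.05 threshold is the conventional VADER cut-off, assumed here;
    the abstract does not state which thresholds the study used.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def sentiment_shares(compounds):
    """Return the percentage of scores that fall under each label."""
    counts = Counter(label_from_compound(c) for c in compounds)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Hypothetical inputs for illustration only; not data from the study.
tweets = [
    "Say no to yahoo yahoo",
    "say no to  YAHOO yahoo",
    "Cybercrime hurts everyone",
]
unique_tweets = deduplicate(tweets)
scores = [-0.6, 0.0, 0.4, -0.2]  # made-up compound scores
shares = sentiment_shares(scores)
```

With real data, the compound scores would come from a sentiment scorer run over the de-duplicated tweets, and the resulting shares would be comparable in form to the percentages reported in the Results.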

Created:
2022-08-18