COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes

Name: COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes
Creator: Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore
Published: 2022-06-25 14:35:40
License: No description provided

Source: arXiv. Updated: 2022-06-25. Indexed: 2024-06-21.

Download link:

https://doi.org/10.3886/E120321

Resource description:

The COVID-19 Twitter Dataset was created by the Institute of High Performance Computing, Singapore. It contains over 250 million tweets from 29 million unique users, covering the period from January 28, 2020 to June 1, 2022. Each tweet is annotated with 17 attributes derived using natural language processing (NLP) techniques: 10 binary attributes indicating whether the tweet is relevant to each of 10 detected topics, 5 quantitative attributes representing sentiment and emotion intensity, and 2 categorical attributes giving the sentiment and the dominant emotion. The dataset is intended to support multidisciplinary research in fields such as communication, psychology, public health, economics, and epidemiology, helping researchers understand and address the complex issues arising from the COVID-19 pandemic.
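As a rough illustration of the per-tweet annotation layout described above, the sketch below models one record as 10 topic flags, 5 intensity scores, and 2 categorical labels. The class name, field names, and example values are hypothetical; this page does not give the dataset's actual column names or label vocabulary, so treat this as a shape check rather than the real schema.

```python
from dataclasses import dataclass
from typing import Tuple


# Hypothetical record layout: names and values are illustrative assumptions,
# not the dataset's real column names.
@dataclass(frozen=True)
class TweetAttributes:
    topics: Tuple[bool, ...]        # 10 binary topic-relevance flags
    intensities: Tuple[float, ...]  # 5 quantitative intensity scores
    sentiment: str                  # categorical sentiment label
    dominant_emotion: str           # categorical dominant-emotion label

    def __post_init__(self) -> None:
        # Enforce the counts stated in the description (10 + 5 + 2 = 17).
        if len(self.topics) != 10:
            raise ValueError("expected 10 topic flags")
        if len(self.intensities) != 5:
            raise ValueError("expected 5 intensity scores")

    def attribute_count(self) -> int:
        # Two categorical attributes plus the topic flags and intensity scores.
        return len(self.topics) + len(self.intensities) + 2


example = TweetAttributes(
    topics=(True,) + (False,) * 9,          # relevant to the first topic only
    intensities=(0.5, 0.1, 0.2, 0.3, 0.4),  # made-up scores
    sentiment="neutral",                    # made-up label
    dominant_emotion="none",                # made-up label
)
print(example.attribute_count())  # 17
```

Validating the counts at construction time makes a mismatched row fail early when loading a file whose layout differs from what the description suggests.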

Provider:

Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore

Created:

2020-07-14
