CML-COVID
Source: arXiv | Updated: 2021-01-29 | Indexed: 2024-06-21
Download link:
https://doi.org/10.18738/T8/W1CHVU
Description:
The CML-COVID dataset was created by the Computational Media Lab at The University of Texas at Austin. It contains 19,298,967 COVID-19-related tweets from 5,977,653 unique users, collected between March and July 2020, primarily by querying the keywords 'coronavirus', 'covid', and 'mask'. The creators analyzed the tweet content and its geographic distribution using topic modeling, sentiment analysis, and descriptive statistics. The dataset is mainly used in public health research to study public perceptions of COVID-19, patterns of information dissemination, and the effectiveness of government and institutional response measures.
Provider:
Computational Media Lab, School of Journalism and Media, Moody College of Communication, The University of Texas at Austin
Created:
2021-01-29



