Twitter Corpus of the #BlackLivesMatter Movement and Counter Protests: 2013 to 2021
Source: arXiv. Updated: 2022-06-07. Indexed: 2024-06-21.
Download link:
https://zenodo.org/record/6393539
Resource overview:
This dataset, titled "Twitter Corpus of the #BlackLivesMatter Movement and Counter Protests: 2013 to 2021", was created jointly by the National Institute on Drug Abuse and the University of Pennsylvania. The corpus contains 63.9 million tweets matching keywords such as #BlackLivesMatter, #AllLivesMatter, and #BlueLivesMatter, and covers more than 100 countries. The tweets, spanning 2013 to 2021, were collected through the Twitter API, and linguistic patterns in the corpus were analyzed with Latent Dirichlet Allocation (LDA). The dataset is intended mainly for research in computational social science, communication, political science, natural language processing, and machine learning, on topics including systemic racism, social movements, grassroots movements, racial inequality, police brutality, and counter-movements.
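As a rough illustration of how tweets could be grouped by the three hashtags named above, the following Python sketch labels tweet texts using a case-insensitive, exact hashtag match. The names `TRACKED`, `label_tweet`, and `count_labels`, the sample texts, and the matching rule itself are assumptions made for illustration; the description does not document the corpus authors' actual filtering procedure or the format of the released files.

```python
import re
from collections import Counter

# Hashtags named in the dataset description. The exact-match rule used here
# is an assumption, not the corpus authors' documented procedure.
TRACKED = ("blacklivesmatter", "alllivesmatter", "bluelivesmatter")

# A hashtag is '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def label_tweet(text):
    """Return the set of tracked hashtags (lowercased) found in a tweet text."""
    found = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    return found & set(TRACKED)


def count_labels(texts):
    """Count how many tweets mention each tracked hashtag."""
    counts = Counter()
    for text in texts:
        counts.update(label_tweet(text))
    return counts


if __name__ == "__main__":
    # Hypothetical sample texts, not taken from the corpus.
    sample = [
        "Marching downtown today #BlackLivesMatter",
        "#alllivesmatter and #BlueLivesMatter",
        "No hashtags in this one",
    ]
    for name, n in sorted(count_labels(sample).items()):
        print(name, n)
```

A tweet carrying several tracked hashtags is counted once under each, so the per-hashtag counts can sum to more than the number of tweets.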
Provider:
National Institute on Drug Abuse
Created:
2020-09-02
Collected Summary
Background and Challenges
Background Overview
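This dataset is a large-scale Twitter corpus of 63.9 million tweets, posted between 2013 and 2021, related to the #BlackLivesMatter movement and its counter-protests. It is global in scope, was analyzed linguistically with LDA, and is intended mainly for research on social movements, racial inequality, and computational social science.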
The summary above was collected and generated by 遇见数据集.



