
Superdiversity dataset

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6367082
Description:
The Superdiversity dataset contains the Superdiversity Index (SI), which measures the diversity of the emotional content expressed in the texts of different communities. The emotional valences of the words a community uses are extracted from Twitter data produced by that community, using lexicon-based Sentiment Analysis. The dataset also includes other possible diversity measures calculated from the same data as the SI, such as the number of tweets in the community language, the Type-Token Ratio, and the number of languages in a community.

Version 1.1: Data is computed for nine nations: France, Germany, Ireland, Italy, the Netherlands, Poland, Portugal, Spain, and the United Kingdom.

Note: The SI ranges over [0, 1].
A value of 0 means the computed valences are very close to those of a standard emotional lexicon.
A value of 0.5 indicates no correlation between the emotional content of the words the community uses on Twitter and the standard emotional content.
A value of 1 would correspond to the use of terms with the opposite emotional content compared to the standard.

Data is computed at three different geographical scales based on the Classification of Territorial Units for Statistics (NUTS), i.e., NUTS1, NUTS2, and NUTS3, for two nations, Italy and the United Kingdom.
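As a rough illustration of two quantities mentioned in the description, the sketch below computes a Type-Token Ratio (distinct tokens divided by total tokens, the standard definition) and maps a correlation coefficient onto the [0, 1] scale of the SI. The mapping `(1 - r) / 2` is only an assumption: it happens to reproduce the three reference points given in the note (0, 0.5, and 1), but the listing does not state how the SI is actually computed. The function names are hypothetical and are not part of the dataset.

```python
def type_token_ratio(tokens):
    """Return distinct tokens divided by total tokens (0.0 for empty input)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def si_from_correlation(r):
    """Map a correlation r in [-1, 1] onto [0, 1].

    ASSUMPTION: this linear map is illustrative only. It matches the
    reference points in the description (r = 1 gives 0, r = 0 gives 0.5,
    r = -1 gives 1) but is not confirmed as the dataset's formula.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return (1.0 - r) / 2.0


if __name__ == "__main__":
    # Toy token list: 3 distinct tokens out of 4.
    print(type_token_ratio(["happy", "sad", "happy", "calm"]))
    # Perfect agreement, no correlation, and perfect opposition.
    for r in (1.0, 0.0, -1.0):
        print(r, si_from_correlation(r))
```

Under this assumed mapping, a community whose word valences track the standard lexicon closely would score near 0, and one whose valences are unrelated to it would score near 0.5.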
Created:
2022-03-31