
Network Models and Content Models for Social Media Users, Domains of Interest, and User Similarity

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://doi.org/10.7910/DVN/ELI5OZ
Description:
Dataset containing processed Twitter user information for content and graph analysis. Users are grouped into 4 manually defined communities (fashion, finance, Australian writers, and chess players), and an additional set of 160 random users is included in the collection. Content features are calculated with NLTK and Dandelion: each user is represented by a vector of word and entity occurrences, based on their tweets. Networks are built using SNAP, and network features are extracted using node2vec: for each domain, the following, followers, and mentions graph is computed and stored in two separate files, one for edges and one for nodes; additionally, the embedding of each node is stored in a separate file.
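To illustrate the idea of representing each user as a vector of word occurrences and comparing users, here is a minimal sketch in plain Python. It is not the dataset authors' pipeline: it uses whitespace tokenization instead of NLTK and Dandelion, and the user names, tweets, and the helper functions `occurrence_vector` and `cosine_similarity` are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def occurrence_vector(tweets):
    """Count lower-cased whitespace tokens across all of a user's tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse occurrence vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy data; the real dataset ships precomputed features.
users = {
    "chess_fan": ["queen gambit opening", "endgame study rook"],
    "chess_pro": ["rook endgame technique", "opening theory queen"],
    "trader": ["stock market rally", "bond yields fall"],
}

vectors = {name: occurrence_vector(tweets) for name, tweets in users.items()}

same_domain = cosine_similarity(vectors["chess_fan"], vectors["chess_pro"])
cross_domain = cosine_similarity(vectors["chess_fan"], vectors["trader"])
print(f"same domain: {same_domain:.3f}, cross domain: {cross_domain:.3f}")
```

The same cosine measure could in principle be applied to the node2vec embeddings stored per node, since those are dense vectors of equal length; the file layout needed to load them is not described here, so that step is left out.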
Created:
2019-07-15