Multi-dimensional author profiling by business roles
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10601198
Description:
This dataset contains the data used in the paper "Multidimensional Author Profiling for Social Business Intelligence": the gold standard (GS) and silver standard (SS) created for training and validating text classifiers for business profiling of social network users.
The GS dataset is a CSV file with the following columns:
screen-name, user-id, verified-user (boolean), multi-level-label, manual-verification, textual-description, followers (int), friends (int), source (not used)
The attribute "multi-level-label" contains the label representing the user's business profile from three perspectives: role, collective-vs-individual, and on-domain. The attribute "manual-verification" records a second pass by experts to validate the assigned label.
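As a minimal sketch of how the GS file might be read, the following uses Python's standard csv module with the column names listed above. Several details are assumptions, not facts from the dataset: whether the file has a header row (`has_header`), that booleans are written as "true"/"false" in some letter case, and that descriptions containing commas are quoted. The function name `read_gs` and the sample row are hypothetical.

```python
import csv
import io

# Column names as listed in the dataset description. Whether the file itself
# carries a header row is an assumption the caller must check.
GS_COLUMNS = [
    "screen-name", "user-id", "verified-user", "multi-level-label",
    "manual-verification", "textual-description", "followers", "friends",
    "source",
]

def read_gs(lines, has_header=False):
    """Parse GS rows into dicts, converting the boolean and integer columns."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    rows = []
    for record in reader:
        row = dict(zip(GS_COLUMNS, record))
        # Assumed boolean encoding: "true"/"false" in any letter case.
        row["verified-user"] = row["verified-user"].strip().lower() == "true"
        row["followers"] = int(row["followers"])
        row["friends"] = int(row["friends"])
        rows.append(row)
    return rows

# Made-up sample row for illustration only; not taken from the dataset.
sample = io.StringIO('acme_news,12345,True,some-label,ok,"News, daily",1500,300,x\n')
rows = read_gs(sample)
print(rows[0]["textual-description"], rows[0]["followers"])
```

For the real file, pass an open file handle (opened with `newline=""`, as the csv module recommends) in place of the in-memory sample.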
The SS dataset is a "|"-separated text file with the following columns:
screen-name|user-id|verified-user|multi-level-label|textual-description
The SS dataset is generated with an unsupervised method from an initial seed of bigrams. The dataset can therefore contain wrong or incomplete labels, hence the name silver standard (SS).
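The SS layout can be parsed line by line, as in the sketch below. It assumes that only the last field (the free-text description) may itself contain a "|" character, so each line is split at most four times. That assumption, the helper name `parse_ss_line`, and the sample lines are illustrative and not confirmed by the dataset.

```python
# Column names as listed in the dataset description.
SS_COLUMNS = [
    "screen-name", "user-id", "verified-user",
    "multi-level-label", "textual-description",
]

def parse_ss_line(line):
    """Split one SS line into a dict, or return None if fields are missing."""
    # Split at most 4 times so that any "|" inside the description is kept.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != len(SS_COLUMNS):
        return None  # malformed line: fewer than five fields
    return dict(zip(SS_COLUMNS, parts))

# Made-up sample lines for illustration only; not taken from the dataset.
good = parse_ss_line("acme_shop|67890|False|some-label|We sell a|b things\n")
bad = parse_ss_line("incomplete|line")
print(good["textual-description"])
print(bad)
```

Because SS labels may be wrong or incomplete, a consumer would probably want to keep the label as an uninterpreted string at this stage and validate it separately.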
As the data were captured from Twitter, we can only release them under restricted conditions.
Created:
2024-03-15



