
Multi-dimensional author profiling by business roles

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/10601198
Description:
This dataset contains the data used in the paper "Multidimensional Author Profiling for Social Business Intelligence": the gold standard (GS) and silver standard (SS) created for training and validating text classifiers for business profiling of social network users.

The GS dataset is a CSV file with the following columns: screen-name, user-id, verified-user (boolean), multi-level-label, manual-verification, textual-description, followers (int), friends (int), source (not used). The attribute "multi-level-label" contains the label representing the user's business profile along three perspectives: role, collective-vs-individual, and on-domain. The attribute "manual-verification" records a second pass in which experts validated the assigned label.

The SS dataset is a "|"-separated text file with the following columns: screen-name|user-id|verified-user|multi-level-label|textual-description

The SS dataset is generated with an unsupervised method from an initial seed of bigrams. It can therefore contain wrong and incomplete labels, hence the name silver standard (SS).

As the data was captured from Twitter, it can only be released under restricted conditions.
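The two file layouts described above can be read with Python's standard library alone. The following is a minimal sketch, not the authors' loading code: it assumes the files carry no header row, that booleans are written as text such as "True" or "False", and that only the last SS field (textual-description) may itself contain a "|" character. The helper names (`read_gs`, `read_ss`, `parse_bool`) and the sample rows are hypothetical and exist only to illustrate the column order given in the description.

```python
import csv
import io

# Column order as listed in the dataset description.
GS_COLUMNS = ["screen-name", "user-id", "verified-user", "multi-level-label",
              "manual-verification", "textual-description",
              "followers", "friends", "source"]
SS_COLUMNS = ["screen-name", "user-id", "verified-user",
              "multi-level-label", "textual-description"]


def parse_bool(value):
    # Assumption: booleans are stored as text; the real encoding may differ.
    return value.strip().lower() in {"true", "1", "yes"}


def read_gs(handle, has_header=False):
    """Read the gold standard CSV into a list of dicts."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    rows = []
    for fields in reader:
        row = dict(zip(GS_COLUMNS, fields))
        row["verified-user"] = parse_bool(row["verified-user"])
        row["followers"] = int(row["followers"])
        row["friends"] = int(row["friends"])
        rows.append(row)
    return rows


def read_ss(handle):
    """Read the "|"-separated silver standard file into a list of dicts."""
    rows = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Split at most 4 times so any "|" inside the description is kept.
        fields = line.split("|", len(SS_COLUMNS) - 1)
        row = dict(zip(SS_COLUMNS, fields))
        row["verified-user"] = parse_bool(row["verified-user"])
        rows.append(row)
    return rows


# Hypothetical sample rows, invented for illustration only.
gs_sample = io.StringIO(
    'acme_corp,12345,True,label-a,yes,"Official account, ACME",1000,50,x\n'
)
ss_sample = io.StringIO(
    "jane_doe|67890|False|label-b|Data analyst | coffee lover\n"
)

gs_rows = read_gs(gs_sample)
ss_rows = read_ss(ss_sample)
```

The multi-level-label value is left as an opaque string here because the description does not say how the three perspectives are encoded within it.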
Created:
2024-03-15