
A Data Quality Multidimensional Model for Social Media Analysis

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10636894
Description:
This dataset contains the data used in the paper to assess how well several metrics determine the relevance of users. The data was extracted from Twitter for the automotive domain, using a query made up of several car brands and models. Three files are provided:

users_all_metrics2.txt
Columns: User_id, statuses, listed, friends, followers, tweets on domain (dataset), Screen name, User language, User location, Verified account (True/False), Coherence of profile (entropy of text under domain model), #Performed actions, #Received actions

tweets_all_metrics.txt.gz
Columns: Tweet_id, replies, retweets, favourites, User_id, statuses, listed, friends, followers, tweets on domain (dataset), Screen name, User language, User location, Verified account (True/False), Coherence of profile, Date of publication (created_at), Tweet Language, processed text, coherence of text, repetitions of text in collection, user's received actions, user's generated actions, text polarity, number of facts, number of linked opinion expressions, number of linked entities

relevant_new.txt
Screen names of the users deemed relevant for the domain

The datasets are "|"-separated text files with no header row; the column names are those listed above.
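
Since the files have no header row, column names have to be supplied when reading them. The following is a minimal sketch of one way to do that with the Python standard library, not an official loader. The snake_case column names, the helper names `open_text` and `read_rows`, the UTF-8 encoding, and the assumption that no field contains a literal "|" are all illustrative guesses, and the sample line is made up for demonstration.

```python
import csv
import gzip
import io

# Illustrative snake_case names for the 13 user columns in the description.
# These are NOT official headers; the files ship without any.
USER_COLUMNS = [
    "user_id", "statuses", "listed", "friends", "followers",
    "tweets_on_domain", "screen_name", "user_language", "user_location",
    "verified", "profile_coherence", "performed_actions", "received_actions",
]


def open_text(path):
    """Open a plain or gzip-compressed text file (UTF-8 is an assumption)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", newline="")
    return open(path, encoding="utf-8", newline="")


def read_rows(lines, columns):
    """Yield one dict per "|"-separated line, keyed by the given column names.

    Assumes no field contains a literal "|"; a line with an unexpected
    number of fields raises ValueError instead of being silently misaligned.
    """
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for line_no, fields in enumerate(reader, start=1):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(columns):
            raise ValueError(
                f"line {line_no}: expected {len(columns)} fields, got {len(fields)}"
            )
        yield dict(zip(columns, fields))


# Made-up sample line, for demonstration only (not taken from the dataset).
sample = io.StringIO("42|1500|3|200|350|12|some_user|en|Turin|False|5.2|7|9\n")
rows = list(read_rows(sample, USER_COLUMNS))
print(rows[0]["screen_name"], rows[0]["followers"])
```

With the real file, something like `read_rows(open_text("users_all_metrics2.txt"), USER_COLUMNS)` would apply; the tweets file can be read the same way with a 26-name column list. All values come back as strings, so numeric fields need converting before analysis.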
Created:
2024-02-27