
Data from: Tracking time evolution of collective attention clusters in twitter: time evolving nonnegative matrix factorisation

DataONE. Updated 2015-09-29; indexed 2024-06-27.
Download link:
https://search.dataone.org/view/null
Description:
Micro-blogging services, such as Twitter, offer opportunities to analyse user behaviour. Discovering and distinguishing behavioural patterns in micro-blogging services is valuable. However, it is difficult to distinguish users and to track the temporal development of collective attention within distinct user groups on Twitter. In this paper, we formulate this problem as tracking the matrices produced by Nonnegative Matrix Factorisation (NMF) of time-sequential matrix data, and propose a novel extension of NMF, which we refer to as Time Evolving Nonnegative Matrix Factorisation (TENMF). In our method, we describe the users and the words they post in a given time interval by a matrix, and use a sequence of such matrices as time-sequential data. We then apply TENMF to these time-sequential matrices. TENMF can decompose time-sequential matrices and track the connection among the decomposed matrices, whereas conventional NMF decomposes each matrix into two lower-dimensional matrices arbitrarily, which might lose the time-sequential connection. Our proposed method performs adequately on artificial data. Moreover, we present several results and insights from experiments using real data from Twitter.
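
The contrast drawn above between factorising each time slice independently and keeping the factors connected over time can be illustrated with a minimal sketch. This is not the TENMF algorithm from the paper, whose update rules are not given here. It is plain NMF with Lee-Seung multiplicative updates, where each slice is warm-started from the previous slice's factors so that component indices tend to stay aligned. The function names `nmf` and `track_factors`, the rank `k`, and the synthetic data are all assumptions made for illustration, and the sketch assumes every slice covers the same users (rows) and words (columns).

```python
import numpy as np

def nmf(V, k, W0=None, H0=None, n_iter=200, eps=1e-9, seed=0):
    """Approximate V (users x words) as W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Random nonnegative start unless initial factors are supplied.
    W = rng.random((n, k)) + eps if W0 is None else W0.copy()
    H = rng.random((k, m)) + eps if H0 is None else H0.copy()
    for _ in range(n_iter):
        # Lee-Seung updates for the Frobenius-norm objective; eps avoids 0/0.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def track_factors(matrices, k, **kwargs):
    """Factorise each time slice, warm-starting from the previous factors.

    Hypothetical stand-in for temporal tracking, not the paper's TENMF.
    """
    factors = []
    W_prev = H_prev = None
    for V in matrices:
        W, H = nmf(V, k, W0=W_prev, H0=H_prev, **kwargs)
        factors.append((W, H))
        W_prev, H_prev = W, H
    return factors

# Synthetic example: 30 users, 20 words, 3 groups, 4 time slices
# whose word profiles drift slowly over time.
rng = np.random.default_rng(1)
W_true = rng.random((30, 3))
H_true = rng.random((3, 20))
slices = [W_true @ (H_true + 0.05 * t) for t in range(4)]

factors = track_factors(slices, k=3)
for t, (W, H) in enumerate(factors):
    rel_err = np.linalg.norm(slices[t] - W @ H) / np.linalg.norm(slices[t])
    print(f"slice {t}: relative reconstruction error {rel_err:.3f}")
```

Without the warm start, each call to `nmf` would begin from an unrelated random state, so component 0 at one time step need not correspond to component 0 at the next. That is the arbitrariness the description attributes to conventional NMF.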

Created:
2015-09-29