Data from: Tracking time evolution of collective attention clusters in Twitter: time evolving nonnegative matrix factorisation
Source: DataONE
Updated: 2015-09-29
Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
Micro-blogging services, such as Twitter, offer opportunities to analyse user behaviour. Discovering and distinguishing behavioural patterns in micro-blogging services is valuable. However, it is difficult and challenging to distinguish users, and to track the temporal development of collective attention within distinct user groups in Twitter. In this paper, we formulate this problem as tracking matrices decomposed by Nonnegative Matrix Factorisation for time-sequential matrix data, and propose a novel extension of Nonnegative Matrix Factorisation, which we refer to as Time Evolving Nonnegative Matrix Factorisation (TENMF). In our method, we describe users and words posted in some time interval by a matrix, and use several matrices as time-sequential data. Subsequently, we apply Time Evolving Nonnegative Matrix Factorisation to these time-sequential matrices. TENMF can decompose time-sequential matrices, and can track the connection among decomposed matrices, whereas previous NMF decomposes a matrix into two lower dimension matrices arbitrarily, which might lose the time-sequential connection. Our proposed method has an adequately good performance on artificial data. Moreover, we present several results and insights from experiments using real data from Twitter.
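The description above does not give TENMF's update rules, so the following is only a minimal sketch of the general setting and not the authors' method. It factorises a sequence of nonnegative user-by-word matrices with plain NMF (Lee-Seung multiplicative updates) and, as a stand-in for tracking, warm-starts each time window from the previous window's factors so that component indices tend to stay aligned. The function names `nmf` and `factorise_sequence`, the rank `k`, and the warm-start heuristic are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, k, W0=None, H0=None, iters=200, eps=1e-9, seed=0):
    """Plain NMF with multiplicative updates (Frobenius loss): V ~ W @ H.

    V is a nonnegative (users x words) matrix; k is the number of clusters.
    W0/H0 are optional nonnegative initial factors (hypothetical warm start).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # eps keeps entries strictly positive so multiplicative updates can move them.
    W = (rng.random((n, k)) if W0 is None else W0) + eps
    H = (rng.random((k, m)) if H0 is None else H0) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def factorise_sequence(matrices, k):
    """Factorise time-ordered matrices, initialising each from the previous result.

    Assumption: this warm start is an illustrative heuristic, not TENMF itself.
    All matrices must share the same shape (same users and vocabulary).
    """
    factors = []
    W_prev = H_prev = None
    for V in matrices:
        W, H = nmf(V, k, W0=W_prev, H0=H_prev)
        factors.append((W, H))
        W_prev, H_prev = W, H
    return factors

if __name__ == "__main__":
    # Toy data: three time windows, 6 users, 5 words, 2 latent groups.
    rng = np.random.default_rng(1)
    windows = [rng.random((6, 2)) @ rng.random((2, 5)) for _ in range(3)]
    for t, (W, H) in enumerate(factorise_sequence(windows, k=2)):
        err = np.linalg.norm(windows[t] - W @ H) / np.linalg.norm(windows[t])
        print(f"window {t}: relative reconstruction error {err:.3f}")
```

Independent NMF runs can return the same components in a different order for each window, which is the loss of time-sequential connection the description refers to; the warm start here only reduces that effect and gives no guarantee.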
Created:
2015-09-29



