Sparse Biclustering of Transposable Data

Name: Sparse Biclustering of Transposable Data
Creator: Taylor & Francis
Published: 2020-09-04 20:55:23
License: No description available

Source: DataCite Commons (updated 2020-09-04, indexed 2024-07-25)

Download link:

https://tandf.figshare.com/articles/dataset/Sparse_Biclustering_of_Transposable_Data/1209699/1

Resource description:

We consider the task of simultaneously clustering the rows and columns of a large transposable data matrix. We assume that the matrix elements are normally distributed with a bicluster-specific mean term and a common variance, and perform biclustering by maximizing the corresponding log-likelihood. We apply an ℓ1 penalty to the means of the biclusters to obtain sparse and interpretable biclusters. Our proposal amounts to a sparse, symmetrized version of k-means clustering. We show that k-means clustering of the rows and of the columns of a data matrix can be seen as special cases of our proposal, and that a relaxation of our proposal yields the singular value decomposition. In addition, we propose a framework for biclustering based on the matrix-variate normal distribution. The performances of our proposals are demonstrated in a simulation study and on a gene expression dataset. This article has supplementary material online.
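For fixed row and column cluster assignments, the ℓ1-penalized mean estimate described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the objective takes the penalized least-squares form implied by a Gaussian likelihood with common variance (half the squared error plus `lam` times the sum of absolute bicluster means). The function names and the parameter `lam` are hypothetical. The full method would also alternate this step with reassigning rows and columns to clusters, which is omitted here.

```python
import numpy as np


def soft_threshold(a, t):
    """Shrink a toward zero by t; values within t of zero become exactly zero."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def update_means(X, row_labels, col_labels, K, R, lam):
    """Penalized bicluster means for fixed labels (assumed objective, see lead-in).

    Minimizing 0.5 * sum (x - mu)^2 + lam * |mu| over a block of n entries
    gives mu = soft_threshold(block_mean, lam / n).
    """
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    mu = np.zeros((K, R))
    for k in range(K):
        rows = np.flatnonzero(row_labels == k)
        for r in range(R):
            cols = np.flatnonzero(col_labels == r)
            n = rows.size * cols.size
            if n == 0:
                continue  # empty bicluster: leave its mean at zero
            block_mean = X[np.ix_(rows, cols)].mean()
            mu[k, r] = soft_threshold(block_mean, lam / n)
    return mu


def objective(X, row_labels, col_labels, mu, lam):
    """Half the squared error of the bicluster fit plus the l1 penalty on the means."""
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    fitted = mu[np.ix_(row_labels, col_labels)]  # expand means to the shape of X
    return 0.5 * np.sum((X - fitted) ** 2) + lam * np.abs(mu).sum()
```

With `lam = 0` the update reduces to plain block averages, which corresponds to the unpenalized, k-means-like case mentioned in the abstract. Larger values of `lam` set weak bicluster means exactly to zero, which is the source of the sparsity.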

Provider:

Taylor & Francis

Created:

2016-01-19
