Supporting data for "D-EE: a distributed software for visualizing intrinsic structure of large-scale single-cell data"
Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.
Download link:
http://gigadb.org/dataset/100815
Description:
Dimensionality reduction and visualization play vital roles in single-cell RNA sequencing (scRNA-seq) data analysis. Although they have been studied extensively, state-of-the-art dimensionality reduction algorithms often fail to preserve the global structure underlying the data. Elastic Embedding (EE), a nonlinear dimensionality reduction method, has shown promise in revealing both the local and the global low-dimensional intrinsic structure of data. However, the current implementation of the EE algorithm does not scale to large-scale scRNA-seq data.

We present a distributed optimization implementation of the EE algorithm, termed distributed Elastic Embedding (D-EE). D-EE reveals the low-dimensional intrinsic structure of data with accuracy equal to that of Elastic Embedding, and it scales to large-scale scRNA-seq data. It leverages distributed storage and distributed computation, achieving memory efficiency and high-performance computing simultaneously. In addition, an extended version of D-EE, termed distributed optimization implementation of time series Elastic Embedding (D-TSEE), lets the user visualize large-scale time series scRNA-seq data by incorporating experimental temporal information. Results on a large-scale scRNA-seq dataset indicate that D-TSEE can uncover oscillatory gene expression patterns by using experimental temporal information.

D-EE is a distributed dimensionality reduction and visualization tool. Its distributed storage and distributed computation techniques allow us to efficiently analyze large-scale single-cell data at the cost of constant time speedup. The source code for the D-EE algorithm, written in C with MPI and tailored to a High Performance Computing cluster, is available from our GitHub archive.
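
As a rough illustration of why an EE-style objective lends itself to distributed computation, the sketch below evaluates the standard Elastic Embedding objective, E(X) = Σ w⁺ₙₘ ‖xₙ − xₘ‖² + λ Σ w⁻ₙₘ exp(−‖xₙ − xₘ‖²), by splitting the rows among several hypothetical workers and summing their partial results. This is not the D-EE code: the function names, the round-robin row partitioning, and the use of plain Python in place of C and MPI are all assumptions made for illustration. In an MPI setting, the final sum would typically be replaced by a reduction across processes.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def ee_objective_rows(X, W_attr, W_rep, lam, rows):
    """Partial EE objective contributed by the given row indices.

    X      : list of low-dimensional points
    W_attr : attractive weights w+ (N x N nested lists)
    W_rep  : repulsive weights w- (N x N nested lists)
    lam    : trade-off parameter lambda
    rows   : the rows this (hypothetical) worker is responsible for
    """
    total = 0.0
    for n in rows:
        for m in range(len(X)):
            if m == n:
                continue
            d2 = sq_dist(X[n], X[m])
            total += W_attr[n][m] * d2 + lam * W_rep[n][m] * math.exp(-d2)
    return total


def ee_objective_partitioned(X, W_attr, W_rep, lam, n_workers):
    """Full EE objective, computed as a sum of per-worker partial sums.

    Rows are assigned round-robin (an assumption for this sketch); each
    worker only needs its own rows of W_attr and W_rep, which is what
    makes distributed storage of the weight matrices possible.
    """
    n = len(X)
    blocks = [range(r, n, n_workers) for r in range(n_workers)]
    partials = [ee_objective_rows(X, W_attr, W_rep, lam, b) for b in blocks]
    return sum(partials)  # stands in for a reduction across processes
```

Because each partial sum touches only the rows assigned to one worker, the total is the same (up to floating-point rounding) regardless of how many workers the rows are split across.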
Provider:
GigaScience Database
Created:
2020-10-13



