Supporting data for "Lifting the curse from high-dimensional data: Automated projection pursuit clustering for the variety of biological data modalities"

Name: Supporting data for "Lifting the curse from high-dimensional data: Automated projection pursuit clustering for the variety of biological data modalities"
Creator: GigaScience Database
Published: 2025-06-09 11:00:48
License: No description available

DataCite Commons; updated 2025-06-09; indexed 2025-04-15

Download link:

http://gigadb.org/dataset/102687

Description:

Unsupervised clustering is a powerful machine-learning technique widely used to analyze high-dimensional biological data. It plays a crucial role in uncovering patterns, structure, and inherent relationships within complex datasets without relying on predefined labels. In the context of biology, high-dimensional data may include transcriptomics, proteomics, and a variety of single-cell omics data. Most existing clustering algorithms operate directly in the high-dimensional space, and their performance may be negatively affected by the phenomenon known as the curse of dimensionality. Here, we show an alternative clustering approach that alleviates the curse by sequentially projecting high-dimensional data into a low-dimensional representation. We validated the effectiveness of our approach, named APP, across various biological data modalities, including flow and mass cytometry data, scRNA-seq, multiplex imaging data, and T-cell receptor repertoire data. APP efficiently recapitulated experimentally validated cell-type definitions and revealed new biologically meaningful patterns.
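The general idea of clustering through sequential low-dimensional projections can be illustrated with a toy sketch. This is not the APP implementation, whose details are not given in this record. It assumes random one-dimensional projections, a swapped-in projection index (the Otsu-style ratio of between-group variance to total variance), and recursive splitting. All function names and parameters (`otsu_split`, `pp_cluster`, `min_score`, `min_size`, `n_candidates`) are hypothetical and chosen for illustration only.

```python
import math
import random


def otsu_split(values):
    """Best two-way split of 1-D values.

    Returns (threshold, score), where score is the between-group
    variance divided by the total variance, a number in [0, 1].
    """
    z = sorted(values)
    n = len(z)
    total = sum(z)
    mean = total / n
    total_var = sum((v - mean) ** 2 for v in z) / n
    if total_var == 0.0:
        return z[0], 0.0
    best_threshold, best_score = z[0], 0.0
    left_sum = 0.0
    for k in range(1, n):  # the k smallest values form the left group
        left_sum += z[k - 1]
        diff = left_sum / k - (total - left_sum) / (n - k)
        score = k * (n - k) * diff * diff / (n * n) / total_var
        if score > best_score:
            best_threshold, best_score = 0.5 * (z[k - 1] + z[k]), score
    return best_threshold, best_score


def random_unit_vector(dim, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project(points, w):
    """Project each point onto the direction w."""
    return [sum(a * b for a, b in zip(p, w)) for p in points]


def pp_cluster(points, min_score=0.85, min_size=20,
               n_candidates=100, max_depth=5, rng=None):
    """Recursively split points along the best-scoring random projection.

    Hypothetical toy procedure: a node is split only if some projection
    separates it into two groups with score >= min_score.
    """
    if rng is None:
        rng = random.Random(0)
    n = len(points)
    labels = [0] * n
    if n < 2 * min_size or max_depth == 0:
        return labels
    dim = len(points[0])
    best_score, best_w, best_t = 0.0, None, None
    for _ in range(n_candidates):
        w = random_unit_vector(dim, rng)
        t, score = otsu_split(project(points, w))
        if score > best_score:
            best_score, best_w, best_t = score, w, t
    if best_w is None or best_score < min_score:
        return labels
    z = project(points, best_w)
    left = [i for i in range(n) if z[i] <= best_t]
    right = [i for i in range(n) if z[i] > best_t]
    if len(left) < min_size or len(right) < min_size:
        return labels
    args = (min_score, min_size, n_candidates, max_depth - 1, rng)
    left_labels = pp_cluster([points[i] for i in left], *args)
    right_labels = pp_cluster([points[i] for i in right], *args)
    offset = max(left_labels) + 1  # keep right-hand labels distinct
    for i, lab in zip(left, left_labels):
        labels[i] = lab
    for i, lab in zip(right, right_labels):
        labels[i] = lab + offset
    return labels
```

Calling `pp_cluster(points)` on a list of equal-length numeric rows returns one integer label per row. The sketch uses only the standard library for portability; a practical version would likely vectorize the projections with NumPy and use a more refined projection index.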

Provider:

GigaScience Database

Created:

2025-04-01