
Metascape results for Prostate cancer multiomics data

Source: ieee-dataport.org (indexed 2025-03-23)
Download link:
https://ieee-dataport.org/documents/metascape-results-prostate-cancer-multiomics-data
Description:
The large p, small n problem, in which features far outnumber samples, is a challenging problem in big data analytics, and no de facto standard method exists for it. In this study, we propose a tensor decomposition (TD) based unsupervised feature extraction (FE) formalism and apply it to multiomics datasets in which the number of features exceeds 100,000 while the number of instances is only about 100. TD based unsupervised FE outperformed conventional supervised feature selection methods, such as random forest, categorical regression (also known as analysis of variance, ANOVA), and penalized linear discriminant analysis, on both multiomics and synthetic datasets. The genes it selected were also biologically reliable. TD based unsupervised FE thus proved to be not only a superior feature selection method but also one that selects biologically reliable genes.
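To make the idea of TD based unsupervised FE concrete, here is a minimal sketch. It is not the authors' implementation, and the description above does not say which decomposition or selection rule they used. The sketch assumes a features × samples × omics tensor, a higher-order SVD computed from mode unfoldings, and a simple top-k ranking on one feature singular vector. All function names, the synthetic data, and the parameter values are hypothetical.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize `tensor` so that `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_singular_vectors(tensor, mode, rank):
    """Leading left singular vectors of the mode unfolding (an HOSVD factor)."""
    u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return u[:, :rank]


def select_features(tensor, component, top_k):
    """Rank features (mode 0) by the magnitude of one feature singular vector.

    Assumption: a plain top-k cut-off stands in for whatever selection
    criterion the original study applied.
    """
    u = mode_singular_vectors(tensor, mode=0, rank=component + 1)
    scores = np.abs(u[:, component])
    return np.argsort(scores)[::-1][:top_k]


def make_synthetic(n_features=2000, n_samples=100, n_omics=3,
                   n_informative=10, seed=0):
    """Hypothetical toy data: Gaussian noise plus a class shift in a few features."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_features, n_samples, n_omics))
    labels = np.repeat([0, 1], n_samples // 2)
    # Shift the informative features for class 1 in every omics layer.
    x[:n_informative, labels == 1, :] += 4.0
    return x, labels


if __name__ == "__main__":
    x, _ = make_synthetic()
    # The class labels are never passed to select_features: it is unsupervised.
    print(sorted(select_features(x, component=0, top_k=10).tolist()))
```

The class labels are used only to build the toy data, never for selection, which is the sense in which the approach is unsupervised. In real use, choosing which singular vector to inspect would normally depend on how the sample-mode vectors relate to the conditions of interest, a step this sketch leaves out.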

Provider:
ieee-dataport.org