Ensemble INtegration Suite – EINS Manuscript Datasets (Ensemble Multi-Omics Integration Increases Robustness and Interpretability)

Source: Figshare. Updated 2026-03-13; indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Ensemble_INtegration_Suite_EINS_Manuscript_Datasets_Ensemble_Multi-Omics_Integration_Increases_Robustness_and_Interpretability_/31700017
Description:
Integrating multi-omics data such as genomics, proteomics, metabolomics, and lipidomics provides deeper insights into cellular biology and disease than single-omics analyses. However, the available integration strategies are diverse, each is based on a distinct statistical framework, and their results are often heterogeneous and difficult to compare. We developed the Ensemble INtegration Suite (EINS), a framework that integrates outputs from numerous multi-omics integration methods for both subtyping and biomarker discovery, i.e., at the sample and embedding levels. For validation we used the Cancer Cell Line Encyclopedia (CCLE) dataset, with 313 cell lines characterized across five omics layers: DNA methylation, miRNA expression, transcriptomics, proteomics, and metabolomics. We also applied EINS to selected lipid transfer protein (LTP) knockout models as a case study. The raw data and the preprocessed data used in EINS are available in this repository.
We validated EINS using the extensively characterized CCLE cancer cell lines. In unsupervised clustering, EINS consistently recovered the original number of primary sites of the cell lines when tested with various combinations, as judged by the adjusted Rand index (ARI) and normalized mutual information (NMI) scores, unlike all other methods. In supervised validation, we classified the 313 cancer cell lines into 17 primary sites using six methods (Random Forest, SVM, XGBoost, Logistic Regression, LASSO, and Elastic Net), and the macro-F1 score of EINS surpassed that of all other methods.
We applied EINS to proteomics and lipidomics characterization data from knockout cell line models of four lipid transfer proteins. The integration revealed clusters sharing biological functions across the omics layers and offered new insights unavailable from either proteomics or lipidomics alone.
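The three validation scores named in the description (ARI, NMI, and macro-F1) can be illustrated with a minimal, standard-library-only Python sketch. This is not the EINS implementation: the function names, the toy primary-site labels, and the choice of arithmetic-mean normalization for NMI are assumptions made for illustration only.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(truth, pred):
    """ARI between two labelings of the same samples (1.0 = identical partitions)."""
    n = len(truth)
    if n < 2:
        return 1.0
    # Count co-clustered pairs in the contingency table and in each margin.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate: one cluster, or all singletons
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_info(truth, pred):
    """NMI, normalized by the arithmetic mean of the two entropies (an assumption)."""
    n = len(truth)
    rows, cols = Counter(truth), Counter(pred)
    mi = sum(
        c / n * log(c * n / (rows[t] * cols[p]))
        for (t, p), c in Counter(zip(truth, pred)).items()
    )
    h_truth = -sum(c / n * log(c / n) for c in rows.values())
    h_pred = -sum(c / n * log(c / n) for c in cols.values())
    mean_entropy = (h_truth + h_pred) / 2
    return mi / mean_entropy if mean_entropy > 0 else 1.0


def macro_f1(truth, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in set(truth) | set(pred):
        tp = sum(t == label and p == label for t, p in zip(truth, pred))
        fp = sum(t != label and p == label for t, p in zip(truth, pred))
        fn = sum(t == label and p != label for t, p in zip(truth, pred))
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Toy data (hypothetical): six cell lines from three primary sites.
sites = ["lung", "lung", "breast", "breast", "skin", "skin"]
clusters = [0, 0, 1, 1, 1, 2]  # an imperfect unsupervised clustering
predicted = ["lung", "lung", "breast", "skin", "skin", "skin"]  # a classifier output

print("ARI:", adjusted_rand_index(sites, clusters))
print("NMI:", normalized_mutual_info(sites, clusters))
print("macro-F1:", macro_f1(sites, predicted))
```

ARI and NMI compare a clustering with the known primary sites without requiring cluster IDs to match site names, which is why they suit the unsupervised setting. Macro-F1 weights all classes equally, which matters when some primary sites contain only a few cell lines.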
We believe that ensemble integration of multi-omics enhances the robustness and interpretability of results, providing a more reliable approach for uncovering multi-omics molecular patterns. EINS is available from https://github.com/EnsembleMultiomics/EINS.
Created:
2026-03-13