
DataSheet1_netMUG: a novel network-guided multi-view clustering workflow for dissecting genetic and facial heterogeneity.DOCX

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/DataSheet1_netMUG_a_novel_network-guided_multi-view_clustering_workflow_for_dissecting_genetic_and_facial_heterogeneity_DOCX/24751422
Resource description:
Introduction: Multi-view data offer advantages over single-view data for characterizing individuals, which is crucial in precision medicine toward personalized prevention, diagnosis, or treatment follow-up.

Methods: Here, we develop a network-guided multi-view clustering framework named netMUG to identify actionable subgroups of individuals. This pipeline first adopts sparse multiple canonical correlation analysis to select multi-view features possibly informed by extraneous data, which are then used to construct individual-specific networks (ISNs). Finally, the individual subtypes are automatically derived by hierarchical clustering on these network representations.

Results: We applied netMUG to a dataset containing genomic data and facial images to obtain BMI-informed multi-view strata and showed how it could be used for a refined obesity characterization. Benchmark analysis of netMUG on synthetic data with known strata of individuals indicated its superior performance compared with both baseline and benchmark methods for multi-view clustering. The clustering derived from netMUG achieved an adjusted Rand index of 1 with respect to the synthesized true labels. In addition, the real-data analysis revealed subgroups strongly linked to BMI and genetic and facial determinants of these subgroups.

Discussion: netMUG provides a powerful strategy, exploiting individual-specific networks to identify meaningful and actionable strata. Moreover, the implementation is easy to generalize to accommodate heterogeneous data sources or highlight data structures.
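The last two steps of the pipeline described above (building individual-specific networks, then clustering them hierarchically) and the adjusted Rand index used in the benchmark can be illustrated with a minimal Python sketch. It is not the netMUG implementation. It assumes that feature selection has already been done, uses plain Pearson correlation as the edge measure, builds each network by a leave-one-out contrast (in the style of LIONESS), and fixes the number of clusters at two instead of deriving it automatically. The toy data and all function and variable names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def individual_specific_networks(X):
    """Return one flattened edge-weight vector per individual (row of X).

    Assumption: edges are Pearson correlations, and the network of
    individual i is n * net(all) - (n - 1) * net(all except i).
    """
    n, p = X.shape
    upper = np.triu_indices(p, k=1)          # one entry per feature pair
    full = np.corrcoef(X, rowvar=False)[upper]
    isns = np.empty((n, full.size))
    for i in range(n):
        without_i = np.corrcoef(np.delete(X, i, axis=0), rowvar=False)[upper]
        isns[i] = n * full - (n - 1) * without_i
    return isns


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same individuals."""
    _, a = np.unique(labels_a, return_inverse=True)
    _, b = np.unique(labels_b, return_inverse=True)
    a, b = a.ravel(), b.ravel()
    table = np.zeros((a.max() + 1, b.max() + 1), dtype=np.int64)
    np.add.at(table, (a, b), 1)              # contingency table

    def pairs(counts):
        return (counts * (counts - 1) // 2).sum()

    same_both = pairs(table)
    same_a = pairs(table.sum(axis=1))
    same_b = pairs(table.sum(axis=0))
    total = len(a) * (len(a) - 1) // 2
    expected = same_a * same_b / total
    max_index = (same_a + same_b) / 2
    if max_index == expected:                # degenerate case, e.g. one cluster
        return 1.0
    return (same_both - expected) / (max_index - expected)


# Hypothetical toy data: 40 individuals, 4 already-selected features.
# Features 0 and 1 are positively coupled in the first stratum and
# negatively coupled in the second.
rng = np.random.default_rng(0)
shared = rng.normal(size=40)
X = rng.normal(scale=0.3, size=(40, 4))
X[:, 0] += shared
X[:20, 1] += shared[:20]
X[20:, 1] -= shared[20:]
truth = np.repeat([0, 1], 20)

isns = individual_specific_networks(X)
tree = linkage(isns, method="ward")          # hierarchical clustering on ISNs
labels = fcluster(tree, t=2, criterion="maxclust")  # simplification: k fixed at 2

print("ARI against the synthetic strata:", adjusted_rand_index(truth, labels))
```

An adjusted Rand index of 1 means the recovered clusters match the reference strata exactly, up to a renaming of the cluster labels; values near 0 indicate agreement no better than chance.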

Created: 2023-12-06