
Supporting data for "scGraph2Vec: a deep generative model for gene embedding augmented by Graph Neural Network and single-cell omics data"

Source: DataCite Commons | Updated: 2025-05-26 | Indexed: 2025-04-15
Download link:
http://gigadb.org/dataset/102624
Description:
Exploring the cellular processes of genes from the perspective of biological networks is of great interest for understanding the properties of complex diseases and biological systems. Biological networks, such as protein-protein interaction networks and gene regulatory networks, provide insights into the molecular basis of cellular processes and often form functional clusters in different tissue and disease contexts.

We present scGraph2Vec, a deep-learning framework for generating informative gene embeddings. scGraph2Vec extends the variational graph autoencoder framework and integrates single-cell datasets and gene-gene interaction networks. We demonstrate that the gene embeddings are biologically interpretable and enable the identification of gene clusters representing functional or tissue-specific cellular processes. In a comparison with similar tools, we show that scGraph2Vec clearly distinguishes different gene clusters and aggregates more biologically functional genes. scGraph2Vec can be widely applied in diverse biological contexts. We illustrate that the embeddings generated by scGraph2Vec can infer disease-associated genes from genome-wide association study data (e.g., COVID-19 and Alzheimer's disease), identify additional driver genes in lung adenocarcinoma, and reveal regulatory genes responsible for maintaining or transitioning melanoma cell states.

scGraph2Vec not only reconstructs tissue-specific gene networks but also obtains a latent representation of genes that implies their biological functions.

Provider:
GigaScience Database
Created:
2024-11-26