Correlation of secreted protein coding gene expression and gene modules between tissues

DataCite Commons | Updated 2024-02-07 | Indexed 2024-08-19
Download link:
https://figshare.com/articles/dataset/Correlation_of_secreted_protein_coding_gene_expression_and_gene_modules_between_tissues/25170008
Description:
We implemented a computational approach that applies low-dimensional representations of gene expression and transfer learning to predict long-range tissue crosstalk signals. Tissue crosstalk signals include many proteins circulating in the blood. Discovering the signaling function carried by these proteins is difficult because their source and destination tissues cannot be easily disaggregated. Cross-tissue co-expression can reveal unexpected connections between gene expression in one tissue and that in another, and hence generate hypotheses about cross-tissue crosstalk. However, this method suffers from low discovery power because of the very large number of pairwise correlations that must be computed across all genes between two tissues, which leads to a severe multiple-testing burden. The approach we propose uses a regularized matrix factorization method (PLIER), which reduces transcriptome data to a lower-dimensional representation whose components correspond to one or a few annotated pathways. This reduces the multiple-testing burden in the cross-tissue gene correlation matrix. Moreover, it lets us generate hypotheses about the downstream signaling function of pairs of source endocrine genes and target-tissue latent variables.

GTEx v8 contains ~17,000 transcriptome profiles from 54 tissue and cell types. However, any two tissues in GTEx may share only a limited number of common donors, because not every donor had every tissue sequenced. To overcome this small sample size, we implemented a transfer learning paradigm in which the PLIER matrices are first trained on a larger transcriptomics data set (the entire GTEx v8 RNA-seq collection of over 17,000 transcriptomes) and then projected onto the smaller subset of samples (i.e., matching RNA-seq data from any pair of tissues from the same donors), which typically contains tens to hundreds of donors.
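The projection step described above can be illustrated with a minimal sketch. It assumes the pretrained model supplies a gene-by-latent-variable loading matrix and that new samples are mapped into latent space with a ridge-regularized least-squares solve. The function name `project_latent_variables`, the `ridge` parameter, and the synthetic data are hypothetical and are not taken from the dataset or from the PLIER package.

```python
import numpy as np


def project_latent_variables(loadings, expression, ridge=1.0):
    """Project expression data into a pretrained latent space.

    loadings:   genes x k matrix learned on the large training set (assumed given).
    expression: genes x samples matrix for the smaller matched-donor subset,
                with rows in the same gene order as `loadings`.
    ridge:      hypothetical regularization strength (an assumption of this sketch).
    Returns a k x samples matrix of latent variable values.
    """
    k = loadings.shape[1]
    gram = loadings.T @ loadings + ridge * np.eye(k)
    # Solve (Z^T Z + ridge * I) B = Z^T Y for B instead of forming an inverse.
    return np.linalg.solve(gram, loadings.T @ expression)


# Synthetic check: recover known latent values from noiseless data.
rng = np.random.default_rng(0)
Z = np.abs(rng.normal(size=(200, 5)))   # 200 genes, 5 latent variables
B_true = rng.normal(size=(5, 30))       # 30 matched-donor samples
Y = Z @ B_true
B_hat = project_latent_variables(Z, Y, ridge=1e-8)
```

Because the loadings are fixed, the projection needs only a small k-by-k solve, so it remains usable when a tissue pair shares just tens of donors.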
We then constructed correlation matrices between known secreted proteins in the source tissue and the latent variables in the target tissue. Using a Bonferroni-adjusted P value to correct for the number of correlations (source endocrine proteins × target latent variables), further adjusted for the number of tissue pairs (54 × 54; i.e., a second Bonferroni threshold of adjusted P < 3.4e-6), we found 122,852 significant cross-tissue signals from 1,680 tissue pairs (average 73 signals per pair).
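The two-level correction can be sketched as follows for a single tissue pair. This is an illustration under stated assumptions, not the authors' code: it assumes Pearson correlation with a two-sided t-test, and a base significance level of 0.01, which is inferred only from the arithmetic 0.01 / (54 × 54) ≈ 3.4e-6. The function name `cross_tissue_signals` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats


def cross_tissue_signals(source_expr, target_lvs, alpha=0.01, n_tissue_pairs=54 * 54):
    """Correlate source-tissue genes with target-tissue latent variables.

    source_expr: genes x donors expression of secreted-protein genes.
    target_lvs:  k x donors latent variable values for the same donors.
    alpha:       assumed base significance level (not stated in the source).
    Returns Pearson r, Bonferroni-adjusted P values, and a significance mask.
    """
    n = source_expr.shape[1]
    xs = (source_expr - source_expr.mean(axis=1, keepdims=True)) / source_expr.std(axis=1, keepdims=True)
    ys = (target_lvs - target_lvs.mean(axis=1, keepdims=True)) / target_lvs.std(axis=1, keepdims=True)
    r = np.clip(xs @ ys.T / n, -1.0, 1.0)            # genes x k correlation matrix
    with np.errstate(divide="ignore"):
        t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    p = 2.0 * stats.t.sf(np.abs(t), df=n - 2)        # two-sided P values
    p_adj = np.minimum(p * r.size, 1.0)              # first Bonferroni: genes x k tests
    significant = p_adj < alpha / n_tissue_pairs     # second Bonferroni: tissue pairs
    return r, p_adj, significant


# Synthetic example: 6 source genes, 4 target latent variables, 80 shared donors.
rng = np.random.default_rng(1)
lvs = rng.normal(size=(4, 80))
genes = rng.normal(size=(6, 80))
genes[0] = lvs[2] + 0.05 * rng.normal(size=80)       # one planted signal
r, p_adj, significant = cross_tissue_signals(genes, lvs)
```

Correlating against latent variables rather than individual genes shrinks the first correction factor from (source genes × target genes) to (source genes × latent variables), which is where the gain in discovery power comes from.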
Provider:
figshare
Created:
2024-02-07