Decomposed matrices used for the analysis described in 'Components of genetic associations across 2,138 phenotypes in the UK Biobank highlight adipocyte biology'

NIAID Data Ecosystem, indexed 2026-03-11

Download link:

https://figshare.com/articles/dataset/Decomposed_matrices_used_for_the_analysis_described_in_Components_of_genetic_associations_across_2_138_phenotypes_in_the_UK_Biobank_highlight_adipocyte_biology_/9202247

Description:

The dataset deposited here contains decomposed matrices of GWAS summary statistics across 2,138 phenotypes, as described in the following publication:

Y. Tanigawa*, J. Li*, et al., Components of genetic associations across 2,138 phenotypes in the UK Biobank highlight adipocyte biology. Nature Communications (2019). doi:10.1038/s41467-019-11953-9.

The data are provided as three Python NumPy (npz) files, each corresponding to one of the three datasets used in the computational analysis described in our manuscript:

- "all" dataset: dev_allNonMHC_z_center_p0001_100PCs_20180129.npz
- "Coding only" dataset: dev_codingNonMHC_z_center_p0001_100PCs_20180129.npz
- "PTVs only" dataset: dev_PTVsNonMHC_z_center_p0001_100PCs_20180129.npz

These files can be loaded with the Python NumPy package and were used in our analysis scripts and notebook (https://github.com/rivas-lab/public-resources/tree/master/uk_biobank/DeGAs). Please read our publication for more information regarding this dataset.

Abstract

Population-based biobanks with genomic and dense phenotype data provide opportunities for generating effective therapeutic hypotheses and understanding the genomic role in disease predisposition. To characterize latent components of genetic associations, we applied truncated singular value decomposition (DeGAs) to matrices of summary statistics derived from genome-wide association analyses across 2,138 phenotypes measured in 337,199 White British individuals in the UK Biobank study. We systematically identified key components of genetic associations and the contributions of variants, genes, and phenotypes to each component. As an illustration of the utility of the approach to inform downstream experiments, we report putative loss of function variants, rs114285050 (GPR151) and rs150090666 (PDE3B), that substantially contribute to obesity-related traits, and experimentally demonstrate the role of these genes in adipocyte biology. Our approach to dissect components of genetic associations across the human phenome will accelerate biomedical hypothesis generation by providing insights on previously unexplored latent structures.
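
The description states that the npz files can be loaded with NumPy but does not list the array names stored inside them. The following is a minimal sketch, not the authors' code, for inspecting one of the archives after downloading it. The helper name `summarize_npz` is hypothetical, and the sketch assumes nothing about the keys: it simply reports whatever arrays the archive contains.

```python
from pathlib import Path

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for each array in an .npz archive.

    Hypothetical helper for illustration; it is not part of the DeGAs repository.
    """
    summary = {}
    # allow_pickle=False avoids executing pickled objects from the file.
    with np.load(path, allow_pickle=False) as archive:
        for name in archive.files:
            try:
                arr = archive[name]
                summary[name] = (arr.shape, str(arr.dtype))
            except ValueError:
                # Object arrays cannot be read without pickle support.
                summary[name] = "object array: reload with allow_pickle=True if the file is trusted"
    return summary


# File name taken from the dataset description; it must be downloaded first.
npz_path = Path("dev_allNonMHC_z_center_p0001_100PCs_20180129.npz")
if npz_path.exists():
    for name, info in summarize_npz(npz_path).items():
        print(name, info)
else:
    print(f"{npz_path} not found; download it from the figshare record first")
```

Listing `archive.files` first is a cautious way to discover the stored array names before relying on any of them; the analysis notebooks in the linked repository remain the authoritative reference for how each array was used.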

Created:

2019-08-01