
Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/11211925
Resource description:
The data below are associated with our paper, "Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance."

1) SNP-gene link predictions generated by pgBoost and by the existing methods SCENT (Sakaue et al. 2024 Nat Genet), Signac (Stuart et al. 2021 Nat Methods), ArchR (Granja et al. 2021 Nat Genet), and Cicero (Pliner et al. 2018 Mol Cell).

pgBoost_scores.tsv.gz contains linking predictions made by pgBoost.
constituent_method_scores.tsv.gz contains linking predictions made by the constituent methods.

Linking scores and percentiles are reported for each method (pgBoost score, SCENT FDR, Signac correlation, ArchR correlation, Cicero co-accessibility). Rank percentiles are computed as 1 - (rank / n). When multiple links receive the same score, they are all assigned the percentile of the top rank. Links left unscored by a method (denoted by zeros* in the linking score column) are assigned a percentile equivalent to the percent of links unscored by that method. See the Methods section of the paper for further details on computing linking scores and summarizing scores across cell types and data sets.

*Candidate links that Cicero tested and assigned a co-accessibility of zero are given a score of 1e-100 in the "Cicero" column, to distinguish unscored candidate links from candidate links assigned a partial correlation of zero (see Pliner et al. 2018 Mol Cell).

NOTE: The predictions associated with this release (version 2) were generated using an expanded set of data sets, an expanded training set, and corrected TSS coordinates.

2) GWAS-derived SNP-gene link evaluation set.

gwas_evaluation.tsv contains the GWAS-derived SNP-gene link evaluation set. Column 1 provides SNP coordinates in the format . This evaluation framework was proposed by Weeks et al. 2024 Nature Genetics based on fine-mapping results from Kanai et al. medRxiv (see Methods: Evaluation data sets of Dorans et al.).

"True" links (gold = 1) are non-coding variants fine-mapped to a focal trait (PIP > 0.1) for which a coding variant in exactly one candidate gene within 1 Mb attains PIP > 0.5 for the same trait. "False" links (gold = 0) are all other candidate SNP-gene pairs involving a SNP with a "true" link. This file of SNP-gene links was adapted from credible set-gene links here (the "truth" column defines true/false links) by identifying SNPs with PIP > 0.1 within each credible set-gene link.
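The rank-percentile rule described under item 1 can be illustrated with a minimal sketch. This is not the authors' code. The function name rank_percentiles is hypothetical, and three conventions are assumed rather than stated in the description: higher scores are better, rank counts from 0 for the best link, and a score of exactly 0 marks an unscored link. Methods where lower is better (for example an FDR) would need their scores reoriented first.

```python
def rank_percentiles(scores):
    """Percentile = 1 - rank / n, with tied links sharing the top rank.

    Assumptions (not confirmed by the dataset description): higher scores
    are better, rank counts from 0 for the best link, and a score of
    exactly 0 marks a link the method left unscored.
    """
    n = len(scores)
    if n == 0:
        return []

    # Unscored links all receive the fraction of links that are unscored.
    n_unscored = sum(1 for s in scores if s == 0)
    unscored_percentile = n_unscored / n

    # Sort the scored links from best to worst. The first position at which
    # a score appears is the top rank of its tied group.
    ordered = sorted((s for s in scores if s != 0), reverse=True)
    top_rank = {}
    for position, s in enumerate(ordered):
        top_rank.setdefault(s, position)

    return [
        unscored_percentile if s == 0 else 1 - top_rank[s] / n
        for s in scores
    ]


# Toy scores: two tied links, one Cicero-style 1e-100 sentinel
# (tested, co-accessibility of zero) and one unscored link (0.0).
toy_scores = [0.9, 0.5, 0.5, 1e-100, 0.0]
for score, pct in zip(toy_scores, rank_percentiles(toy_scores)):
    print(score, round(pct, 3))
```

In this toy input the two tied links share one percentile, the 1e-100 sentinel is ranked as a scored link below them, and the single unscored link receives the unscored fraction (one in five). With rank counted from 0, treating the zeros as one tied group at the bottom gives the same value as the stated rule for unscored links, which is why that convention was chosen for the sketch.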
Created:
2025-03-03