Source data of the MultiSTAAR manuscript "A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies".

Source: NIAID Data Ecosystem (indexed 2026-05-02)

Download link:

https://zenodo.org/record/14213841

Description:

This dataset is the source data for Figures 2-3 and Extended Data Figures 1-2 of the MultiSTAAR manuscript, "A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies". MultiSTAAR is a statistical framework and computationally scalable analytical pipeline for functionally informed multi-trait rare variant analysis in large-scale whole-genome sequencing (WGS) studies.

Figure 2. Manhattan plots and Q-Q plots for unconditional gene-centric coding, noncoding and ncRNA multi-trait analysis of low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C) and triglycerides (TG) using TOPMed data (n = 61,838).

Figure 3. Genetic region (2-kb sliding window) unconditional multi-trait analysis results for LDL-C, HDL-C and TG using TOPMed data (n = 61,838).

Extended Data Figure 1. Manhattan plots and Q-Q plots for unconditional gene-centric coding, noncoding and genetic region (2-kb sliding window) multi-trait analysis of fasting glucose (FG) and fasting insulin (FI) using TOPMed data (n = 21,731).

Extended Data Figure 2. Manhattan plots and Q-Q plots for unconditional gene-centric coding, noncoding and genetic region (2-kb sliding window) multi-trait analysis of C-reactive protein (CRP), interleukin 6 (IL-6), lipoprotein-associated phospholipase A2 (Lp-PLA2) activity and Lp-PLA2 mass using TOPMed data (n = 9,380).
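Since several of the figures above are Q-Q plots, the following is a minimal sketch of how Q-Q plot coordinates are conventionally derived from a list of p-values. It is not taken from the MultiSTAAR pipeline. The function name `qq_points` is hypothetical, and because the file and column layout of the Zenodo source data is not described in this record, loading the data is left out and the input is assumed to be a plain list of p-values.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p) for a Q-Q plot.

    Hypothetical helper: assumes p_values is an iterable of floats,
    with None marking missing entries.
    """
    # Keep only valid p-values in (0, 1] and sort from most to least significant.
    clean = sorted(p for p in p_values if p is not None and 0.0 < p <= 1.0)
    n = len(clean)
    points = []
    for rank, p in enumerate(clean, start=1):
        # Expected quantile of a Uniform(0, 1) sample under the null hypothesis.
        expected = -math.log10((rank - 0.5) / n)
        observed = -math.log10(p)
        points.append((expected, observed))
    return points


if __name__ == "__main__":
    # Toy p-values for illustration only; these are not values from the dataset.
    for expected, observed in qq_points([0.5, 0.01, 0.2, None]):
        print(f"{expected:.3f}\t{observed:.3f}")
```

Plotting the observed values against the expected values gives the Q-Q plot; points lying well above the diagonal correspond to p-values smaller than expected under the null.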

Created:

2024-11-25