Haplotype Analysis Reveals Pleiotropic Disease Associations in the HLA Region Manuscript Supplementary Tables
Source: NIAID Data Ecosystem. Indexed 2026-05-02.
Download link:
https://zenodo.org/record/12763468
Description:
Supplementary Table 1: Description of all 2,459 FinnGen diseases included in the association analyses, including the disease categories, number of cases, and ICD codes. The first column ("NAME") is the short name assigned by FinnGen, the second ("TAGS") is the group tag assigned by FinnGen, "LONGNAME" is the full name of the disease, "HD_ICD_10" is the ICD-10 code, "HD_ICD_9" is the ICD-9 code, "HD_ICD_8" is the ICD-8 code, "category" is the FinnGen description of the disease tag, "num_cases" is the number of cases for that disease, "num_controls" is the number of controls, "lambda" is the lambda from the FinnGen GWAS runs, "HLA_hits" is the number of conditionally independent SNP associations identified in this study within the HLA boundaries, "non_HLA_hits" is the number of fine-mapped variants for all SNPs outside the HLA region, "group" is the grouping used for the enrichment analysis, based on the FinnGen groupings, and "ManualCategory" is the manually curated grouping based on shared pathophysiology for the diseases identified as HLA-associated by the SNP analysis. For diseases where the underlying pathophysiologic mechanism is unknown, or where the disease can have multiple causes, "ManualCategory" is recorded as "Organ". "ManualSubcategory" is the same as "ManualCategory", except that for diseases where "ManualCategory" is "Organ" it lists the part of the body or organ primarily affected. "Plot" is the concise interpretable name used to plot the 269 diseases included in the main haplotype group regression analysis, the results of which are shown in Figure 5. Full details of how FinnGen defined each disease are available at https://r10.risteys.finngen.fi/.
Supplementary Table 2: Enrichment of GWAS hits in the HLA region for each disease group with at least 10 diseases in the group, for all diseases in FinnGen with at least one GWAS hit (MAF > 1%) anywhere in the genome. The first column ("group") corresponds to the disease group, the second column ("enrich") has the enrichment for that disease group, the third column ("n") has the number of diseases in that disease group, the fourth column ("HLA_hits") has the mean number of independent SNP associations in the HLA region for diseases in that disease group, the fifth column ("non_HLA_hits") has the mean number of independent SNP associations outside the HLA region for diseases in that disease group, the sixth column ("se_HLA") has the standard error of the number of independent SNPs in the HLA region across diseases in that disease group, and the last column ("se_non_HLA") has the standard error of the number of independent SNPs outside the HLA region across diseases in that disease group. The second tab has the same format but includes results from the repeated analysis using the 644 diseases that remained after randomly removing one disease for each pair with an LDSC genetic correlation > 0.95.
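As a rough illustration of how the per-group mean and standard error columns relate to the per-disease counts in Supplementary Table 1, the following sketch aggregates toy rows by "group". It assumes the standard error is the standard error of the mean (sample standard deviation divided by the square root of the group size), which the description does not state explicitly. The group names and counts are invented, and the "enrich" column is not computed because its formula is not given here.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Toy rows mimicking three Supplementary Table 1 columns; all values are invented.
rows = [
    {"group": "Autoimmune", "HLA_hits": 4, "non_HLA_hits": 10},
    {"group": "Autoimmune", "HLA_hits": 2, "non_HLA_hits": 6},
    {"group": "Autoimmune", "HLA_hits": 3, "non_HLA_hits": 8},
    {"group": "Cardiometabolic", "HLA_hits": 0, "non_HLA_hits": 20},
    {"group": "Cardiometabolic", "HLA_hits": 2, "non_HLA_hits": 30},
]


def summarize_by_group(rows):
    """Return per-group size, mean hit counts, and standard errors of the mean."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row["group"]].append(row)

    summary = {}
    for group, members in by_group.items():
        hla = [m["HLA_hits"] for m in members]
        non_hla = [m["non_HLA_hits"] for m in members]
        n = len(members)
        summary[group] = {
            "n": n,
            "HLA_hits": mean(hla),
            "non_HLA_hits": mean(non_hla),
            # Assumed definition: sample SD / sqrt(n); undefined for a single disease.
            "se_HLA": stdev(hla) / sqrt(n) if n > 1 else float("nan"),
            "se_non_HLA": stdev(non_hla) / sqrt(n) if n > 1 else float("nan"),
        }
    return summary


summary = summarize_by_group(rows)
for group, stats in summary.items():
    print(group, stats)
```

In the actual table only groups with at least 10 diseases are reported, so a filter on "n" would be needed before comparing against it.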
Supplementary Table 3: Regression results for the significant SNP-disease associations remaining after stepwise conditional analysis in the HLA region. The first column has the FinnGen short name for the disease, the second column has the longer name of the disease, the third has the SNP ID, the fourth has the position of the SNP, the fifth has the reference allele, the sixth has the alternative allele, the seventh has the allele frequency of the alternative allele, the eighth has the rsID of the SNP, and the next five columns have the beta, standard error, Z-score, and p-value for the association of that SNP with that disease. The next column ("nearest_genes") has the nearest gene to the SNP, followed by the conditional analysis round in which the SNP was found to be independently significant for the disease ("round"), and then the variant annotation for the SNP ("annot"). This table includes the 1,064 disease associations from the full SNP conditional analysis with all 572 diseases. The conditional analysis focusing on just the 269 diseases included in the main analysis, after removing redundant traits, resulted in 540 disease associations across 428 unique SNPs.
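For readers checking the beta, standard error, Z-score, and p-value columns against each other, the sketch below shows the usual relationship between them. It assumes a Wald statistic (Z = beta / SE) and a two-sided p-value under the standard normal approximation. The description does not say how the reported p-values were computed, so small differences from the table would not be surprising. The function name and input values are hypothetical.

```python
from math import erfc, sqrt


def wald_z_and_p(beta, se):
    """Return (Z, two-sided p-value) for an effect estimate and its standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    # Two-sided normal tail probability: P(|N(0, 1)| >= |z|) = erfc(|z| / sqrt(2)).
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Invented example values, not taken from the table.
z, p = wald_z_and_p(beta=0.30, se=0.05)
print(f"Z = {z:.3f}, p = {p:.3e}")
```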
Supplementary Table 4: Haplotype group and individual haplotype statistics and assignments. The first tab ("allhaplotype_group_stats") has the total doses of each haplotype group present in the dataset for each block. Tabs 2-4 have the haplotype information, with one tab for each block. The first 1000 columns correspond to the 1000 SNPs in the haplotype, where 0 corresponds to the reference allele and 1 corresponds to the alternative allele; the second-to-last column ("total_doses") has the total doses of that haplotype, and the last column ("haplotype_group") has the haplotype group that the haplotype belonged to for that block. For FinnGen privacy policy reasons, total doses for individual haplotypes are included for all haplotypes with > 10 total doses.
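The per-block tabs can be pictured as rows of 0/1 allele codes followed by "total_doses" and "haplotype_group". A minimal sketch of summing doses per haplotype group from such rows follows, using 5 SNPs instead of 1000 and invented values. The dictionary layout and the function name are assumptions for illustration, not the format of the actual spreadsheet.

```python
from collections import Counter

# Toy haplotypes: 0 = reference allele, 1 = alternative allele. All values are invented.
haplotypes = [
    {"alleles": [0, 1, 0, 0, 1], "total_doses": 120, "haplotype_group": 1},
    {"alleles": [0, 1, 0, 1, 1], "total_doses": 45, "haplotype_group": 1},
    {"alleles": [1, 0, 1, 0, 0], "total_doses": 300, "haplotype_group": 2},
    {"alleles": [1, 0, 1, 1, 0], "total_doses": 11, "haplotype_group": 2},
    {"alleles": [0, 0, 0, 0, 0], "total_doses": 80, "haplotype_group": 3},
]


def doses_per_group(haplotypes):
    """Sum total_doses over the individual haplotypes assigned to each group."""
    totals = Counter()
    for hap in haplotypes:
        totals[hap["haplotype_group"]] += hap["total_doses"]
    return dict(totals)


group_totals = doses_per_group(haplotypes)
print(group_totals)
```

Sums computed this way from the per-block tabs may be lower than the group totals in the first tab if haplotypes with 10 or fewer total doses are not listed individually.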
Supplementary Table 5: Results from the haplotype group association analyses across all 3 blocks. The first tab ("main_hapgroup_reg_results") has the full results from the main haplotype group association analysis, with the first column ("traits") indicating the disease, the second ("hapgroup") referring to the haplotype group, the next column ("Z_rescaled") giving the regression Z-scores rescaled to add back in the dropped haplotype group for each block, the next column ("plotted") indicating whether or not that association is plotted in the heatmap of Figure 5, and the final column ("block") indicating the block in the HLA region in which the haplotype group was identified. The next two tabs have the (non-rescaled) regression results for all diseases, with joint modeling to condition on the relevant classical HLA alleles in the block (tab "allregresults_adjallelesinblock") and without it (tab "all_hapgroup_reg_results_sig").
Supplementary Table 6: Regression results for all significant allele associations for all diseases, for both the approach jointly modeling alleles within a given block (tab 1, "alleles_vifindep_joint") and the approach with one individual allele per regression (tab 2, "alleles_indiv"). For both tabs, the first column has the disease name, the next four columns have the beta, standard error, Z-score, and p-value, respectively, for the association of that allele with that disease, and the sixth column has the allele. For the first tab, the last column has the number of the block that the allele is within.
Supplementary Table 7: Haplotype disease association results in UK Biobank. Tab 1 ("ukbhapassociations") has the results for blocks 1-3 for the associations plotted in the UKBB heatmap, with the first column ("traits") indicating the disease code, the second ("hapgroup") referring to the haplotype group, the next column ("Z") giving the regression Z-scores, the next column ("LONG_NAME") giving the long name for the disease, the next column ("N") giving the number of cases for that disease, the next column ("Plot") giving the concise interpretable name used for the heatmap plot label, and the last column ("block") indicating the block the haplotype group belonged to. Tab 2 ("hapgroupdoses") has the number of haplotypes included in each haplotype group.
Supplementary Table 8: Haplotype disease association results in UK Biobank using UK Biobank haplotypes mapped onto the original FinnGen haplotype groups. Tab 1 has the UK Biobank results for blocks 1-3 comparing the Z-scores for the equivalent haplotype group disease association in FinnGen, with the first column ("block") indicating the block the haplotype group belonged to, the second ("hapgroup") referring to the haplotype group, the third column ("finngen_trait") indicating the disease code in FinnGen, the next column ("finngen_Z") referring to the values of the FinnGen regression Z-scores, the next column ("ukb_trait") indicating the disease code in UK Biobank, and the last column ("ukb_Z") with the values of the UK Biobank regression Z-scores. Tab 2 has the total doses in UK Biobank for each of the original FinnGen haplotype groups. Tab 3 has the replication analysis results for the original Pearson's correlation analysis across haplotype groups for all pairwise combinations of the diseases mapped between FinnGen and UKBB.
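To make the Pearson's correlation analysis across haplotype groups concrete, here is a minimal sketch that correlates the Z-score vectors of each pair of diseases, with one Z-score per haplotype group. It assumes every disease has a Z-score for the same haplotype groups in the same order. The disease names, values, and function names are hypothetical, and the study's actual pipeline may differ in how it handles missing or unplotted associations.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(z_by_disease):
    """Correlate Z-score vectors for every unordered pair of diseases."""
    return {
        (a, b): pearson(z_by_disease[a], z_by_disease[b])
        for a, b in combinations(sorted(z_by_disease), 2)
    }


# Invented Z-scores across four haplotype groups.
z_by_disease = {
    "disease_A": [2.0, -1.0, 0.5, 3.0],
    "disease_B": [4.0, -2.0, 1.0, 6.0],    # same pattern as disease_A, doubled
    "disease_C": [-2.0, 1.0, -0.5, -3.0],  # opposite pattern to disease_A
}

correlations = pairwise_correlations(z_by_disease)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

For a replication check, the same pairwise correlations could be computed separately from the "finngen_Z" and "ukb_Z" columns and then compared.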
Created:
2025-04-12



