GMIP-PLSR Reference Data: Multi-Omics Feature Matrices and LD Reference Files for Post-GWAS Gene Prioritization
Source: DataCite Commons. Updated: 2026-05-03. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19986368
Description:
Reference data for the GMIP-PLSR pipeline (https://github.com/mohammedmsk/GMIP), a Nextflow pipeline for post-GWAS gene prioritization via multi-omics integration and Partial Least Squares Regression.
This archive contains:
1000 Genomes Phase 3 EUR LD reference files (PLINK format) for MAGMA SNP-to-gene mapping
Full PoPS multi-omics feature matrices (munged, per-chromosome format)
S-LDSC baseline model v1.1, PLINK bfiles, frequency files, and LD weights for benchmarking evaluation
Pre-computed gene window LD score files (100 kb and 50 kb windows)
NAFLD (Miao et al.) example GWAS feature matrices used in the accompanying preprint
The archive also contains static_gits/, which holds pinned source copies of two third-party tools used by the pipeline:
LDSC (Bulik-Sullivan et al., Nature Genetics 2015) — MIT License (https://github.com/bulik/ldsc)
Benchmarker (Finucane Lab) — MIT License (https://github.com/FinucaneLab/benchmarker)
These are included to ensure reproducibility of the exact software versions used in the preprint analyses.
Associated preprint: Kanchwala MS et al. "GMIP-PLSR: A Nextflow Pipeline for GWAS and Multi-Omics Integration in Gene Prioritization Using PLSR" bioRxiv (2026). doi:10.64898/2026.04.06.716845
Usage: Download and extract with the setup_references.sh script provided in the GitHub repository, or upload to S3 for cloud execution.
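For the cloud route, the upload step might look like the following minimal sketch. It assumes the archive has already been extracted to a local directory and that boto3 is installed with AWS credentials configured. The function names, bucket, and prefix are hypothetical and are not part of the GMIP repository; the key layout the pipeline actually expects should be checked against the repository's own documentation.

```python
from pathlib import Path


def plan_s3_keys(local_root, prefix):
    """Map every file under local_root to an S3 object key under prefix."""
    root = Path(local_root)
    clean_prefix = prefix.strip("/")
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Use POSIX separators so keys look the same on every platform.
            rel = path.relative_to(root).as_posix()
            key = f"{clean_prefix}/{rel}" if clean_prefix else rel
            plan.append((path, key))
    return plan


def upload_references(local_root, bucket, prefix):
    """Upload the extracted reference tree to s3://<bucket>/<prefix>/ (hypothetical layout)."""
    import boto3  # imported lazily so the planning step works without boto3

    s3 = boto3.client("s3")
    for path, key in plan_s3_keys(local_root, prefix):
        # upload_file(Filename, Bucket, Key) handles multipart transfer for large files.
        s3.upload_file(str(path), bucket, key)
```

Calling `plan_s3_keys` on its own gives a dry run of what would be uploaded, which is worth checking before transferring the large LD reference and feature matrix files.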
Provider:
Zenodo
Created:
2026-05-03



