Expression QTLs for NYGC ALS Consortium Paper
Source: DataCite Commons. Updated 2026-05-04; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19636256
Description:
The files below contain nominal and permuted quantitative trait loci (QTL) associations between common genetic variants derived from whole genome sequencing and gene expression phenotypes generated from RNA-seq of post-mortem tissue sections. All QTLs were mapped with TensorQTL.
Top association files are gzip-compressed tab-separated values files - *cis_qtl.txt.gz
Nominal association files are stored as Parquet files to save space. These can be converted to text files using the following code snippet:
pip install pandas pyarrow
conda install -c bioconda htslib # provides bgzip
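python3 -c "
import sys
import pandas as pd
df = pd.read_parquet('your_file.parquet')
# Write the table to stdout so that it is piped into bgzip,
# rather than to the same file that the shell redirect creates
df.to_csv(sys.stdout, sep='\t', index=False)
" | bgzip > your_file.tsv.gz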
NYGC_all_common_variants_alleles.tsv.gz - Allele information for all SNPs tested in the eQTL analysis
Nominal QTL results include all SNP-gene pairs tested using a 1 Mb window on each side of the transcription start site (TSS) of the gene. Table columns are formatted as follows:
phenotype_id - Ensembl ID of the gene tested (GENCODE v30)
variant_id - SNP tested for association (rsid or chr:position:ref:alt)
tss_distance - distance of the SNP to the gene transcription start site (TSS)
maf - minor allele frequency in cohort
ma_samples - number of samples carrying the minor allele
ma_count - total number of minor alleles across individuals
pval_nominal - nominal P-value from linear regression
slope - slope of the linear regression
slope_se - standard error of the slope
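As an illustration of how the nominal columns fit together, the following is a minimal sketch that parses a converted text file and derives a t statistic (slope divided by slope_se) and an approximate 95% interval for each row. It assumes the text file has the columns listed above as its header; the function names, the example IDs and the numeric values are hypothetical, and the interval uses a normal approximation rather than anything stated by the dataset authors.

```python
import csv
import io


def read_nominal(handle):
    """Yield nominal QTL rows from a tab-separated text handle, parsing numeric fields."""
    for row in csv.DictReader(handle, delimiter="\t"):
        for key in ("tss_distance", "ma_samples", "ma_count"):
            row[key] = int(row[key])
        for key in ("maf", "pval_nominal", "slope", "slope_se"):
            row[key] = float(row[key])
        yield row


def effect_summary(row, z=1.96):
    """Return (t statistic, lower bound, upper bound) for one row.

    The interval is slope +/- z * slope_se, a normal approximation (assumption).
    """
    t_stat = row["slope"] / row["slope_se"]
    half_width = z * row["slope_se"]
    return t_stat, row["slope"] - half_width, row["slope"] + half_width


# Hypothetical two-row table; IDs and values are made up for illustration only.
EXAMPLE = (
    "phenotype_id\tvariant_id\ttss_distance\tmaf\tma_samples\tma_count\t"
    "pval_nominal\tslope\tslope_se\n"
    "GENE_A\tchr1:1000:A:G\t-500\t0.25\t40\t45\t0.001\t0.50\t0.10\n"
    "GENE_A\tchr1:2000:C:T\t500\t0.10\t18\t19\t0.40\t-0.05\t0.06\n"
)

rows = list(read_nominal(io.StringIO(EXAMPLE)))
summaries = [effect_summary(r) for r in rows]
```

For a real file, the same reader could be fed with gzip.open('your_file.tsv.gz', 'rt') in place of the in-memory example.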
Top association results include only the top SNP-gene association for each gene. Table columns are formatted as follows:
phenotype_id - Ensembl ID of the gene tested (GENCODE v30)
num_var - total number of variants tested in cis
beta_shape1 - first parameter value of the fitted beta distribution
beta_shape2 - second parameter value of the fitted beta distribution
true_df - effective degrees of freedom of the beta distribution approximation
pval_true_df - empirical P-value for the beta distribution approximation
variant_id - ID of the top variant (rsid or chr:position:ref:alt)
tss_distance - distance of the SNP to the gene transcription start site (TSS)
ma_samples - number of samples carrying the minor allele
ma_count - total number of minor alleles across individuals
maf - minor allele frequency in cohort
ref_factor - flag indicating if the alternative allele is the minor allele in the cohort (1 if AF <= 0.5, -1 if not)
pval_nominal - nominal P-value from linear regression
slope - slope of the linear regression
slope_se - standard error of the slope
pval_perm - first permutation P-value, obtained directly from the permutations (direct method)
pval_beta - second permutation P-value obtained via beta approximation. This is the one to use for downstream analysis
qval - Storey q-value derived from pval_beta (FDR adjusted)
pval_nominal_threshold - nominal P-value threshold for calling a variant-gene pair significant for the gene
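The qval and pval_nominal_threshold columns can be combined with the nominal results to list significant variant-gene pairs. The sketch below is one plausible way to do this, not the authors' procedure: it assumes a gene is retained when qval is at or below a chosen FDR level (0.05 here, an assumption) and that a pair counts as significant when its pval_nominal is at or below that gene's pval_nominal_threshold. Function names, gene IDs and values are hypothetical.

```python
def egene_thresholds(top_rows, fdr=0.05):
    """Map phenotype_id -> pval_nominal_threshold for genes with qval <= fdr."""
    return {
        r["phenotype_id"]: float(r["pval_nominal_threshold"])
        for r in top_rows
        if float(r["qval"]) <= fdr
    }


def significant_pairs(nominal_rows, thresholds):
    """Keep nominal rows whose gene passed the FDR cut and whose P-value
    is at or below that gene's nominal threshold."""
    return [
        r
        for r in nominal_rows
        if r["phenotype_id"] in thresholds
        and float(r["pval_nominal"]) <= thresholds[r["phenotype_id"]]
    ]


# Hypothetical rows; only the columns used here are shown.
top_rows = [
    {"phenotype_id": "GENE_A", "qval": "0.01", "pval_nominal_threshold": "1e-4"},
    {"phenotype_id": "GENE_B", "qval": "0.30", "pval_nominal_threshold": "5e-5"},
]
nominal_rows = [
    {"phenotype_id": "GENE_A", "variant_id": "chr1:1000:A:G", "pval_nominal": "2e-6"},
    {"phenotype_id": "GENE_A", "variant_id": "chr1:2000:C:T", "pval_nominal": "3e-3"},
    {"phenotype_id": "GENE_B", "variant_id": "chr2:500:G:A", "pval_nominal": "1e-8"},
]

thresholds = egene_thresholds(top_rows)
pairs = significant_pairs(nominal_rows, thresholds)
```

In this toy input only the first GENE_A pair is kept: the second GENE_A pair exceeds the gene's threshold, and GENE_B does not pass the FDR cut even though its nominal P-value is small.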
Allele Information for each variant:
CHROM - chromosome of the variant
POS - position of the variant in the chromosome
REF - reference allele (GRCh38)
ALT - alternative allele (this is the effect allele in the eQTL analysis)
ID - variant id (rsid or chr:position:ref:alt)
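Because ALT is the effect allele, the allele table can be used to attach explicit alleles to a QTL row and to re-express a slope relative to the reference allele. The following is a minimal sketch under the assumption that variant_id in the QTL tables matches ID in the allele table exactly; the function names and example records are hypothetical.

```python
def load_alleles(allele_rows):
    """Build a lookup from variant ID to (REF, ALT)."""
    return {r["ID"]: (r["REF"], r["ALT"]) for r in allele_rows}


def annotate_effect_allele(qtl_row, alleles):
    """Return a copy of a QTL row with effect_allele (ALT) and other_allele (REF).

    Returns None when the variant is missing from the allele table.
    """
    pair = alleles.get(qtl_row["variant_id"])
    if pair is None:
        return None
    ref, alt = pair
    out = dict(qtl_row)
    out["effect_allele"] = alt
    out["other_allele"] = ref
    return out


def slope_for_reference_allele(qtl_row):
    """Flip the sign of the slope so that it describes the REF allele instead of ALT."""
    return -float(qtl_row["slope"])


# Hypothetical records for illustration only.
allele_rows = [
    {"CHROM": "chr1", "POS": "1000", "REF": "A", "ALT": "G", "ID": "chr1:1000:A:G"},
]
qtl_row = {"phenotype_id": "GENE_A", "variant_id": "chr1:1000:A:G", "slope": "0.5"}

alleles = load_alleles(allele_rows)
annotated = annotate_effect_allele(qtl_row, alleles)
```

Returning None for unmatched variants, rather than raising an error, is a design choice of this sketch so that missing IDs can be counted and inspected by the caller.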
Provider:
Zenodo
Created:
2026-05-04



