Expression QTLs for NYGC ALS Consortium Paper

DataCite Commons. Updated 2026-05-04. Indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19636256
Description:
The files below contain nominal and permuted quantitative trait loci (QTL) associations between common genetic variants derived from whole genome sequencing and gene expression phenotypes generated from RNA-seq of post-mortem tissue sections. All QTLs were mapped with TensorQTL.

Top association files are gzip-compressed tab-separated value files - *cis_qtl.txt.gz

Nominal association files are stored as Parquet files to save space. These can be converted to text files using the following code snippet:

pip install pandas pyarrow
conda install -c bioconda htslib # provides bgzip
python3 -c "
import sys
import pandas as pd
df = pd.read_parquet('your_file.parquet')
# write the table to stdout so that bgzip receives it through the pipe
df.to_csv(sys.stdout, sep='\t', index=False)
" | bgzip > your_file.tsv.gz

NYGC_all_common_variants_alleles.tsv.gz - Allele information for all SNPs tested in the eQTL analysis

Table columns are formatted as follows:

Nominal QTL results include all SNP-gene pairs tested using a 1Mb window on each side of the transcription start site (TSS) of the gene.

phenotype_id - Ensembl ID of the gene tested (GENCODE v30)
variant_id - SNP tested for association (rsid or chr:position:ref:alt)
tss_distance - distance of the SNP to the gene transcription start site (TSS)
maf - minor allele frequency in cohort
ma_samples - number of samples carrying the minor allele
ma_count - total number of minor alleles across individuals
pval_nominal - nominal P-value from linear regression
slope - slope of the linear regression
slope_se - standard error of the slope

Top association results include only the top SNP-gene association for each gene.
Table columns are formatted as follows:

phenotype_id - Ensembl ID of the gene tested (GENCODE v30)
num_var - total number of variants tested in cis
beta_shape1 - first parameter value of the fitted beta distribution
beta_shape2 - second parameter value of the fitted beta distribution
true_df - effective degrees of freedom of the beta distribution approximation
pval_true_df - empirical P-value for the beta distribution approximation
variant_id - ID of the top variant (rsid or chr:position:ref:alt)
tss_distance - distance of the SNP to the gene transcription start site (TSS)
ma_samples - number of samples carrying the minor allele
ma_count - total number of minor alleles across individuals
maf - minor allele frequency in MiGA cohort
ref_factor - flag indicating if the alternative allele is the minor allele in the cohort (1 if AF <= 0.5, -1 if not)
pval_nominal - nominal P-value from linear regression
slope - slope of the linear regression
slope_se - standard error of the slope
pval_perm - first permutation P-value directly obtained from the permutations with the direct method
pval_beta - second permutation P-value obtained via beta approximation. This is the one to use for downstream analysis
qval - Storey q-value derived from pval_beta (FDR adjusted)
pval_nominal_threshold - nominal P-value threshold for calling a variant-gene pair significant for the gene

Allele information for each variant:

CHROM - chromosome of the variant
POS - position of the variant in the chromosome
REF - reference allele (GRCh38)
ALT - alternative allele (this is the effect allele in the eQTL analysis)
ID - variant id (rsid or chr:position:ref:alt)
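
The column definitions above suggest one way the three tables could be combined: pick genes by qval in the top association file, apply each gene's pval_nominal_threshold to the nominal results, then attach REF and ALT from the allele table. The following is a minimal sketch of that idea and not a procedure stated by the dataset authors. The function name significant_pairs and the 0.05 FDR cutoff are assumptions made for illustration; only the column names come from the description.

```python
import pandas as pd


def significant_pairs(top: pd.DataFrame, nominal: pd.DataFrame,
                      alleles: pd.DataFrame, fdr: float = 0.05) -> pd.DataFrame:
    """Return nominal SNP-gene pairs that pass their gene's threshold.

    The function name and the default FDR cutoff are illustrative assumptions.
    """
    # Genes whose top association passes the chosen q-value cutoff.
    egenes = top.loc[top["qval"] <= fdr,
                     ["phenotype_id", "pval_nominal_threshold"]]
    # Restrict nominal pairs to those genes and carry over each gene's threshold.
    pairs = nominal.merge(egenes, on="phenotype_id", how="inner")
    # Apply the per-gene nominal P-value threshold.
    pairs = pairs[pairs["pval_nominal"] <= pairs["pval_nominal_threshold"]]
    # Attach allele columns; ALT is described as the effect allele.
    annotated = pairs.merge(alleles, left_on="variant_id", right_on="ID",
                            how="left")
    return annotated.drop(columns="ID").reset_index(drop=True)
```

The inputs could be loaded with pd.read_csv(path, sep='\t') for the *cis_qtl.txt.gz and allele files and with pd.read_parquet(path) for a nominal file. The left merge keeps pairs whose variant_id has no match in the allele table, leaving REF and ALT empty for those rows.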
Provider:
Zenodo
Created:
2026-05-04