
The impact of genetically controlled splicing on exon inclusion and protein structure

Indexed from the NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/7275061
Resource description:
This repository contains raw and processed files used in Einson et al. 2022. Descriptions of the files contained within each subdirectory follow.

01_raw_psi
  {tissue_id}_v8.psi.tsv.gz: Unfiltered PSI output from IPSA-nf, per tissue. See methods for details about how files were created.
  gtex_v8_exon_id_map.tsv: Mapping file between exon coordinates and Ensembl gene IDs, with the suffix used in the GTEx v8 gencode annotation.

02_qtl_results
  cross_tissue
    top_sQTLs_MAF05.tsv: List of top sQTLs across tissues, with one exon and top variant per tissue. See methods for details. See matching file for column descriptions.
    top_sQTLs_median_psi.tsv: The median, mean, and standard deviation of PSI of each significant exon from the previous file, taken across all individuals from GTEx with data available.
    top_sQTLs_MAF05_w_anc_allele.tsv: List of top sQTLs across tissues, with additional columns for the top ψQTL ancestral and derived alleles, where available.
  per_tissue
    {tissue_id}_combined_sQTLs.tsv.gz: Raw output of ψQTL calling using QTLtools in grouped permutation mode per tissue, with groups specified by gene. See methods for more details, and https://qtltools.github.io/qtltools/ for column descriptions.

03_qtl_credible_sets
  GTEx_psi_{tissue_id}.collapsed.txt.gz: Output of the eQTL Catalogue fine-mapping pipeline (https://github.com/eQTL-Catalogue/qtlmap), run on all exons and tissues, and collapsed using the procedure described in Methods.

04_qtl_coloc
  combined_coloc_results_full.tsv.gz: Combined output of running coloc on ψQTLs from the 18 GTEx tissues against 87 sets of GWAS summary statistics. This file contains all results, including non-significant associations. A nominal QTLtools pass was used as input. The nominal QTL calls are not included in this repository due to size limitations; contact the authors if you need access to them.
  top_sQTLs_with_top_coloc_event.tsv: The QTLs in top_sQTLs_MAF05.tsv with additional columns for the GWAS with the highest posterior probability of a colocalization event. Importantly, the tissue and top variant may not match the main top_sQTLs_MAF05.tsv file for every gene.

05_exon_features: See matching files for a description of each column.
  cross_tissue_constitutive_exons_with_AF.tsv: Detailed features of cross-tissue constitutive exons. See methods for the definition of constitutive exons.
  cross_tissue_nonsignificant_genes_with_AF.tsv: Detailed features of sufficiently variable exons with no significant variant across tissues. See methods for more details.
  top_sQTLs_MAF05_with_AF.tsv: Detailed features of top sQTLs.
  top_sQTLs_with_top_coloc_with_AF.tsv: Detailed features of sQTLs that colocalize with at least one GWAS trait. Contains columns for Euclidean distances between PAE matrices and RMSD between isoforms, among genes with a significant GWAS colocalization event.

06_predicted_structures: Each prediction was run 5 times, and we report the best model in the manuscript.
  {protein.id}[_mutant].result
    {protein.id}[_mutant]{_run.id}_coverage.png.gz: Plot of the number of sequences per position in the MSA.
    {protein.id}[_mutant]{_run.id}_PAE.png.gz: PAE matrix plots for each model.
    {protein.id}[_mutant]{_run.id}_plddt.png.gz: pLDDT plots for each model.
    {protein.id}[_mutant]{_run.id}_predicted_aligned_error_v1.json.gz: A PAE matrix for the best model, using AlphaFold-DB's format.
    {protein.id}[_mutant]{_run.id}_unrelaxed_rank_{rank.num}_model_{model.num}_scores.json.gz: Per-model array (list of lists) with PAE, a list of the average pLDDT, and the pTM score.
    {protein.id}[_mutant]{_run.id}_unrelaxed_rank_{rank.num}_model_{model.num}_pdb.gz: Per-model predicted structure in PDB format.
    {protein.id}[_mutant]{_run.id}.a3m.gz: A3M-formatted input MSA.
    cite.bibtex: BibTeX file with citations for all tools and databases used.
    config.json: Model input parameters.

07_other_data
  cross_tissue_constitutive_exons.tsv: List of exons that are constitutively spliced across multiple tissues. See methods for details.
  cross_tissue_nonsignificant_genes.tsv: List of variably spliced exons with no significant sVariant in any tissue. See methods for details.
  gtex_v8_exon_id_map.rds: RDS representation of a map between exon IDs, as used in the modified version of gencode v26, and exon hg38 coordinates.
  gtex_v8_n_exons_per_gene.tsv: Number of exons per gene, as annotated in the modified version of gencode v26 used in GTEx v8.
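Because the column layouts are documented in the matching files and in the methods rather than here, a reasonable first step with any of the tables above is to look at the header before relying on column names. The following is a minimal Python sketch of that step, not part of the original repository. It assumes the archive has been extracted so that the directory names match the listing, and that the first line of each table is a header. The helper names psi_path and peek_tsv, and the tissue ID used in the usage note, are hypothetical.

```python
import csv
import gzip
from pathlib import Path


def psi_path(root, tissue_id):
    """Build the path to one tissue's PSI table.

    Follows the {tissue_id}_v8.psi.tsv.gz pattern listed under 01_raw_psi.
    The on-disk layout under `root` is an assumption.
    """
    return Path(root) / "01_raw_psi" / f"{tissue_id}_v8.psi.tsv.gz"


def peek_tsv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a plain or gzipped TSV.

    Assumes the first line is a header; no column names are assumed.
    """
    path = Path(path)
    # Pick a gzip-aware opener for .gz files, the built-in open otherwise.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        # zip stops after n_rows, so large files are not read in full.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, peek_tsv(psi_path("path/to/extracted_archive", "Example_Tissue")) would return the header and the first three rows of that tissue's PSI table, where both the root path and the tissue ID are placeholders to be replaced with real values. The same peek_tsv call works for the uncompressed .tsv files in 02_qtl_results/cross_tissue and 05_exon_features.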
Created:
2023-07-12