Allele-specific chromatin accessibility for 23 cancer types and weight files for Regulome Wide Association-Studies (RWAS)
Source: Mendeley Data. Updated: 2024-05-17. Indexed: 2024-06-27.
Download link:
https://zenodo.org/records/6371439
Description:
stratAS_results.tar.gz
This folder contains the results of analyses of allele-specific chromatin accessibility for 23 cancer types + pan-cancer using the stratAS software. The analyses identified allelically imbalanced genomic regions from 406 cancer ATAC-Seq samples. The files are named after each cancer type (or “pancancer” for the pan-cancer analysis) and contain the following columns:
CHR - Chromosome
POS - Position of test SNP
RSID - ID of test SNP
P0 - Start of gene/peak
P1 - End of gene/peak
NAME - Name of gene/peak
CENTER - Center position of peak (or TSS for gene)
N.HET - Number of heterozygous individuals tested
N.READS - Number of reads tested in total
ALL.AF - Allelic fraction estimate from beta binomial test across both conditions
ALL.BBINOM.P - Beta-binomial test for imbalance across both conditions
C0.AF - Allelic fraction estimate from condition 0
C0.BBINOM.P - Beta-binomial test for imbalance in condition 0
C1.AF - Allelic fraction estimate from condition 1
C1.BBINOM.P - Beta-binomial test for imbalance in condition 1
DIFF.BBINOM.P - Beta-binomial test for difference between conditions
IND.C0 - Number of each condition 0 individual included in this test (comma separated)
IND.C0.COUNT.REF - Condition 0 REF allele counts of each individual included in this test (comma separated)
IND.C0.COUNT.ALT - Condition 0 ALT allele counts of each individual included in this test (comma separated)
IND.C1 - Number of each condition 1 individual included in this test (comma separated)
IND.C1.COUNT.REF - Condition 1 REF allele counts of each individual included in this test (comma separated)
IND.C1.COUNT.ALT - Condition 1 ALT allele counts of each individual included in this test (comma separated)
In our analysis, samples that belong to the cancer type being analyzed for allelic imbalance are assigned to Condition 1 (C1), while samples belonging to the remaining 22 cancer types are assigned to Condition 0 (C0).
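The column layout described here can be read with a few lines of Python. The following is a minimal sketch, not part of the stratAS software: it assumes the result files are tab-delimited with a header row (check the actual files before relying on this), and the helper names, the `alpha` threshold of 0.05 (applied without any multiple-testing correction), and the derived key `C1.POOLED.ALT.FRACTION` are all hypothetical. The pooled fraction is a raw read ratio and is not the same quantity as the beta-binomial estimate in `C1.AF`. The toy rows are made up purely for illustration.

```python
import csv
import io


def parse_counts(field):
    """Split a comma-separated count field into ints; empty or NA gives []."""
    if field is None or field.strip() in ("", "NA"):
        return []
    return [int(x) for x in field.split(",")]


def significant_c1_rows(handle, alpha=0.05):
    """Yield rows with C1.BBINOM.P < alpha, adding a pooled C1 ALT read fraction.

    Assumes a tab-delimited file with a header row (an assumption about the
    file format, not something stated in the dataset description).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        p = row["C1.BBINOM.P"]
        if p in ("", "NA"):
            continue  # no test result for condition 1
        if not float(p) < alpha:
            continue  # not significant (also skips NaN)
        ref = parse_counts(row["IND.C1.COUNT.REF"])
        alt = parse_counts(row["IND.C1.COUNT.ALT"])
        total = sum(ref) + sum(alt)
        # Hypothetical derived column: raw ALT reads / all reads in C1.
        row["C1.POOLED.ALT.FRACTION"] = sum(alt) / total if total else None
        yield row


# Toy input (made-up values) holding only the columns the sketch needs.
toy = (
    "CHR\tPOS\tRSID\tC1.BBINOM.P\tIND.C1.COUNT.REF\tIND.C1.COUNT.ALT\n"
    "1\t1000\trs_toy1\t0.001\t10,20\t30,40\n"
    "1\t2000\trs_toy2\t0.5\t5,5\t5,5\n"
    "1\t3000\trs_toy3\tNA\t\t\n"
)
hits = list(significant_c1_rows(io.StringIO(toy)))
for hit in hits:
    print(hit["RSID"], hit["C1.BBINOM.P"], hit["C1.POOLED.ALT.FRACTION"])
```

For a real file, the same function could be called with an open file handle in place of the `io.StringIO` wrapper.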
For the pan-cancer analysis, samples from all 23 cancer types are assigned to C1. P-values denoting the significance of allelic imbalance in a given cancer type (or pan-cancer analysis) are listed in the column C1.BBINOM.P. P-values denoting the significance of differential allelic imbalance between a given cancer type and the remaining 22 cancer types are listed in column DIFF.BBINOM.P (set to NA for pan-cancer analysis since all samples are assigned to C1).
More information about stratAS and its output format can be found at https://github.com/gusevlab/stratAS. The conducted analyses are described in detail in the methods section of Grishin D. and Gusev A. Allelic imbalance of chromatin accessibility in cancer identifies candidate causal risk variants and their mechanisms (2022).

RWAS_weights.tar.gz
This folder contains weight files that can be used together with GWAS summary statistics to conduct Regulome Wide Association-Studies (RWAS) using the FUSION software package. Each weight file corresponds to a single accessible element in the FUSION format. Instructions on how to use FUSION for RWAS can be found at http://gusevlab.org/projects/fusion/. RWAS is conducted in the same manner as TWAS using these RWAS weights.
The weight files were generated using a pan-cancer peak set. A BED file containing pan-cancer peaks can be found in the peaks_hg19 folder along with other BED files containing cancer-type specific peaks. Cancer-type specific peak files can be used to restrict the RWAS analysis to regulatory elements active in specific cancer types.
The conducted analyses are described in detail in the methods section of Grishin D. and Gusev A. Allelic imbalance of chromatin accessibility in cancer identifies candidate causal risk variants and their mechanisms (2022).
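One possible way to restrict an RWAS run to elements active in a single cancer type is to filter the list of weights by overlap with that cancer type's BED file before passing it to FUSION. The sketch below is illustrative only. It assumes a FUSION-style position list with a header containing `CHR`, `P0` and `P1` columns is available for the weights, that `P0`/`P1` are 1-based inclusive coordinates on the same genome build as the BED files, and that BED intervals are 0-based and half-open. The function names and the toy peak entries are hypothetical and not taken from the archive.

```python
from collections import defaultdict


def norm_chrom(name):
    """Drop a leading 'chr' so that '1' and 'chr1' compare equal."""
    return name[3:] if name.lower().startswith("chr") else name


def read_bed(lines):
    """Parse BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    intervals = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals[norm_chrom(chrom)].append((int(start), int(end)))
    return intervals


def overlaps(intervals, chrom, p0, p1):
    """True if the 1-based closed range [p0, p1] overlaps any BED interval."""
    return any(
        start < p1 and p0 <= end
        for start, end in intervals.get(norm_chrom(chrom), [])
    )


def filter_pos(pos_lines, intervals):
    """Keep the header plus the rows whose [P0, P1] overlaps a BED interval."""
    rows = [line.rstrip("\n") for line in pos_lines if line.strip()]
    header = rows[0].split()
    i_chr, i_p0, i_p1 = (header.index(c) for c in ("CHR", "P0", "P1"))
    kept = [rows[0]]
    for row in rows[1:]:
        fields = row.split()
        if overlaps(intervals, fields[i_chr], int(fields[i_p0]), int(fields[i_p1])):
            kept.append(row)
    return kept


# Toy inputs (made-up entries) standing in for a cancer-type BED file
# and a position list of weights.
bed = ["chr1\t100\t200", "chr2\t500\t600"]
pos = [
    "WGT ID CHR P0 P1",
    "peak_a.wgt.RDat peak_a 1 150 250",
    "peak_b.wgt.RDat peak_b 1 201 300",
    "peak_c.wgt.RDat peak_c 2 550 580",
]
kept = filter_pos(pos, read_bed(bed))
print("\n".join(kept))
```

The overlap check is a simple linear scan, which keeps the sketch short; for full-size peak sets, an interval index or a dedicated tool such as bedtools would likely be a better fit.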
Created:
2023-06-28



