Cis-regulatory mutations with driver hallmarks in major cancers
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://data.mendeley.com/datasets/4kx5sfx9vz
Description:
Jan-11-2021
Two types of datasets are provided for each cancer type:
VarScan-called somatic mutation data: raw somatic mutations called with default VarScan parameters, without any filtering.
Gene-level ASE data: gene-level allele-specific expression (ASE) based on RNA-seq data from tumor samples.
In our paper, these data were further filtered, and association tests were performed between gene-level ASE and the occurrence of somatic mutations within different regulatory regions.
For details, see the Methods section of our iScience paper:
Zhongshan Cheng, Michael Vermeulen, Micheal Rollins-Green, Brian DeVeale, Tomas Babak. 2021. Cis-regulatory mutations with driver hallmarks in major cancers. iScience.
Dataset Annotations:
Headers for the somatic mutation data called from whole-genome sequencing (WGS) data with VarScan
(dataset: Cancer_type_varscan_mutations.csv):
chrom="chromosome on which the mutation resides"
position="mutation position on the chromosome (hg19)"
ref="reference allele for the mutation"
var="mutated allele for the mutation"
normal_reads1="sequence reads for the reference allele in normal WGS"
normal_reads2="sequence reads for the mutated allele in normal WGS"
normal_var_freq="variant allele frequency in normal WGS"
normal_gt="normal genotype at this site"
tumor_reads1="sequence reads for the reference allele in tumor WGS"
tumor_reads2="sequence reads for the mutated allele in tumor WGS"
tumor_var_freq="variant allele frequency in tumor WGS"
tumor_gt="tumor genotype at this site"
somatic_p_value="VarScan somatic mutation P value"
gp="TCGA WGS sample ID"
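As a rough illustration of how the columns above might be read, the following minimal sketch parses a mutation table with Python's standard csv module and applies a simple filter. The two sample rows are invented, the cutoffs max_p and min_tumor_freq are hypothetical and are not the filters used in the paper, and the helper names are illustrative only. It also assumes that the frequency columns may be written as percent strings (as in VarScan's native output) or as plain fractions; the actual files should be checked before relying on this.

```python
import csv
import io

# Toy rows using the documented column names; every value is invented.
SAMPLE = """chrom,position,ref,var,normal_reads1,normal_reads2,normal_var_freq,normal_gt,tumor_reads1,tumor_reads2,tumor_var_freq,tumor_gt,somatic_p_value,gp
chr1,12345,A,G,30,0,0%,A,20,10,33.33%,R,0.0004,SAMPLE_A
chr2,67890,C,T,25,1,3.85%,C,28,2,6.67%,Y,0.41,SAMPLE_B
"""

READ_COLUMNS = ("normal_reads1", "normal_reads2", "tumor_reads1", "tumor_reads2")
FREQ_COLUMNS = ("normal_var_freq", "tumor_var_freq")


def parse_freq(text):
    """Return an allele frequency as a fraction, accepting '33.33%' or '0.3333'."""
    text = text.strip()
    if text.endswith("%"):
        return float(text[:-1]) / 100.0
    return float(text)


def load_mutations(handle):
    """Read mutation rows and convert the numeric columns to numbers."""
    rows = []
    for row in csv.DictReader(handle):
        row["position"] = int(row["position"])
        for key in READ_COLUMNS:
            row[key] = int(row[key])
        for key in FREQ_COLUMNS:
            row[key] = parse_freq(row[key])
        row["somatic_p_value"] = float(row["somatic_p_value"])
        rows.append(row)
    return rows


def filter_mutations(rows, max_p=0.05, min_tumor_freq=0.1):
    """Keep rows passing the hypothetical P value and tumor frequency cutoffs."""
    return [
        r for r in rows
        if r["somatic_p_value"] <= max_p and r["tumor_var_freq"] >= min_tumor_freq
    ]


mutations = load_mutations(io.StringIO(SAMPLE))
kept = filter_mutations(mutations)
print([(r["chrom"], r["position"], r["gp"]) for r in kept])
```

With these toy rows only the first mutation passes both cutoffs. To read a real file, an open file handle for Cancer_type_varscan_mutations.csv could be passed to load_mutations in place of the StringIO object.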
Headers for the gene-level ASE dataset (dataset: Cancer_type_gene_level_ASE.csv):
transcript_id="assembled transcript ids for gene-level ASE"
ASE_Reads_Hap1="RNA-seq read sum for phased haplotype 1"
ASE_Read_Hap2="RNA-seq read sum for phased haplotype 2"
SNP_Read_on_Hap1_2="allele read counts on each haplotype for each SNP phased into the two haplotypes"
SNPs="SNPs phased into two haplotypes for the assembled transcript"
SNP_Alleles="Two alleles of each SNP phased into each haplotype"
TCGA_Sample_ID="TCGA RNA-seq ID"
transcript_st="assembled transcript start position (hg19)"
transcript_end="assembled transcript end position (hg19)"
chr="chromosome information for assembled transcript"
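The gene-level ASE columns above can be summarized per transcript in a similar way. The sketch below is a minimal example under stated assumptions: the sample rows are invented, the haplotype 1 read fraction is just one common way to express allelic imbalance and is not necessarily the statistic used in the paper, and the in_window helper with its flank parameter is a hypothetical stand-in for the regulatory-region definitions described in the Methods. The SNP-level columns are omitted because their exact encoding is not documented here.

```python
import csv
import io

# Toy rows using a subset of the documented column names; values are invented.
SAMPLE = """transcript_id,ASE_Reads_Hap1,ASE_Read_Hap2,TCGA_Sample_ID,transcript_st,transcript_end,chr
TX_1,80,20,SAMPLE_A,1000,5000,chr1
TX_2,0,0,SAMPLE_A,9000,12000,chr2
"""


def allelic_ratio(hap1, hap2):
    """Fraction of phased reads on haplotype 1, or None when there are no reads."""
    total = hap1 + hap2
    return hap1 / total if total else None


def load_ase(handle):
    """Read gene-level ASE rows and attach the haplotype 1 read fraction."""
    records = []
    for row in csv.DictReader(handle):
        hap1 = float(row["ASE_Reads_Hap1"])
        hap2 = float(row["ASE_Read_Hap2"])
        records.append({
            "transcript_id": row["transcript_id"],
            "sample": row["TCGA_Sample_ID"],
            "chr": row["chr"],
            "start": int(row["transcript_st"]),
            "end": int(row["transcript_end"]),
            "hap1": hap1,
            "hap2": hap2,
            "ratio": allelic_ratio(hap1, hap2),
        })
    return records


def in_window(record, chrom, position, flank=0):
    """True if a position lies within the transcript extended by `flank` bases."""
    return (
        record["chr"] == chrom
        and record["start"] - flank <= position <= record["end"] + flank
    )


ase = load_ase(io.StringIO(SAMPLE))
for rec in ase:
    print(rec["transcript_id"], rec["ratio"])
```

A ratio near 0.5 indicates balanced expression of the two haplotypes, while values near 0 or 1 indicate imbalance. Transcripts with no phased reads get None rather than a misleading 0. Linking these records to the mutation table would additionally require matching the RNA-seq sample IDs to the WGS sample IDs, which is not shown here.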
Created:
2021-01-22



