Atlas of nascent RNA transcripts reveals enhancer to gene linkages
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10223321
Description:
Data associated with the paper "Atlas of nascent RNA transcripts reveals enhancer to gene linkages"
GitHub repository for the analyses: https://github.com/Dowell-Lab/DBNascent_Analysis
Below are the summaries of the files associated with this publication.
1. muMerge calls for each paper used in the merging
paper_mumerge_calls
- The calls are separated by the bidirectional caller (dreg, tfit)
- In the folders are bed files (e.g. Allen2014global_hg38_dreg_MUMERGE.bed) for each paper and species (hg38, mm10)
2. Base content for regions called by dREG and Tfit in each paper in mouse and human
mumerge_base_composition
- The base content for each paper after the first round of muMerge
- The files contain the id and the base nucleotide content in 300bp around the center region (id, cg, at)
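As an illustration of the (id, cg, at) summary described above, the following minimal Python sketch computes the CG and AT content of a window centered on a region's midpoint. It is not the authors' code: the function name `base_content` is hypothetical, and it assumes that "300bp around the center" means a 300 bp window in total and that the values are fractions rather than raw counts. Neither point is stated in the description.

```python
def base_content(sequence: str, window: int = 300) -> dict:
    """Return CG and AT fractions in a window centered on the sequence midpoint.

    Assumption: `window` is the total window width (150 bp on each side
    of the center for the default of 300).
    """
    seq = sequence.upper()
    center = len(seq) // 2
    half = window // 2
    start = max(0, center - half)
    region = seq[start:center + half]
    if not region:
        return {"cg": 0.0, "at": 0.0}
    # Count C/G and A/T bases; any other symbol (e.g. N) counts toward neither.
    cg = region.count("C") + region.count("G")
    at = region.count("A") + region.count("T")
    return {"cg": cg / len(region), "at": at / len(region)}
```

For example, `base_content("ACGT" * 100)` evaluates a 300 bp window with equal numbers of each base, so both fractions are 0.5.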
3. Bidirectional regions called by Tfit and dREG after merging. Regions are for mouse and human datasets. (See https://github.com/Dowell-Lab/bidirectionals_merged)
bidirectional_regions
- Bidirectional regions called after muMerge and filtering
- Calls for both human and mouse datasets are reported (hg38_tfit_dreg_bidirectionals.bed.gz and mm10_tfit_dreg_bidirectionals.bed.gz)
- The bed files are in bed6 format with the following columns:
chromosome, start, stop, bidirectional id, number of papers in which the bidirectional was called, strand (always "." since bidirectionals are not stranded)
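The bed6 layout above can be read with a few lines of standard-library Python. This is a sketch under stated assumptions, not part of the deposited analysis code: the names `Bidirectional`, `parse_bed6_line`, and `read_bidirectionals` are hypothetical, and it assumes the files are tab-delimited and that the fifth column is an integer paper count.

```python
import gzip
from typing import Iterator, NamedTuple


class Bidirectional(NamedTuple):
    chrom: str
    start: int
    stop: int
    name: str
    n_papers: int  # number of papers in which the region was called
    strand: str    # expected to be "." (unstranded)


def parse_bed6_line(line: str) -> Bidirectional:
    """Parse one tab-delimited bed6 line (assumed layout, see lead-in)."""
    chrom, start, stop, name, n_papers, strand = line.rstrip("\n").split("\t")[:6]
    return Bidirectional(chrom, int(start), int(stop), name, int(n_papers), strand)


def read_bidirectionals(path: str, min_papers: int = 1) -> Iterator[Bidirectional]:
    """Yield regions from a gzipped bed file supported by at least `min_papers` papers."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and any bed header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            record = parse_bed6_line(line)
            if record.n_papers >= min_papers:
                yield record
```

A call such as `read_bidirectionals("hg38_tfit_dreg_bidirectionals.bed.gz", min_papers=2)` would then keep only regions called in at least two papers. The threshold is illustrative.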
4. Metadata for samples used in the SPECS and correlation analysis
metadata
- Sample metadata for filtered samples (human_samples_QC_GC_protocol_filtered.tsv.gz) in the downstream analyses
5. SPECS scores across genes and bidirectional regions
specs_scores
- The SPECS scores for all tissues analyzed (filt_qc123_all_specs_all.txt.gz)
- The maximum SPECS scores (filt_qc123_all_specs_maxval.txt.gz)
- The minimum SPECS scores (filt_qc123_all_specs_minval.txt.gz)
- The SPECS scores were also split by disease vs non-disease samples
- TPM summaries are also included
6. Normalized counts
normalized_counts
- Gene and bidirectional region normalized counts (gene_bidir_tpm.tsv.gz)
7. Bidirectional Region and gene pairs (See https://github.com/Dowell-Lab/bidir_gene_pairs)
bidirectional_gene_pairs
- Gene and bidirectional region pairs (dbnascent_pairs.txt.gz) across tissues in high-quality samples.
- The pairs are reported in a bed12 file, where the first 6 columns are gene coordinates and the following 6 are bidirectional coordinates.
- The remaining columns are the summary statistics for correlation and the relationship between the gene and bidirectional.
- Additional columns note whether the pair overlaps eQTLs from GTEx (eQTL) or polII ChIA-PET loops
transcript1_chrom : Gene chromosome
transcript1_start : Gene start coordinate
transcript1_stop : Gene stop coordinate
transcript_1 : Gene id
transcript1_score : Gene score (. since none was assigned)
transcript1_strand : Gene strand
transcript2_chrom : Bidirectional chromosome
transcript2_start : Bidirectional start coordinate
transcript2_stop : Bidirectional stop coordinate
transcript_2 : Bidirectional id
transcript2_score : Bidirectional score (i.e. the number of papers that support a bidirectional from muMerge)
transcript2_strand : Bidirectional strand (. since these are not stranded)
pcc : Pearson correlation coefficient
pval : P-value
adj_p_BH : Adjusted p-value (Benjamini-Hochberg correction)
nObs : Number of observations in correlation analysis
t : T statistic
distance_tss : Distance between the gene start (TSS) and the bidirectional start coordinate
distance_tes : Distance between the gene stop (TES) and the bidirectional start coordinate
position : Whether the bidirectional is upstream or downstream of the TSS
tissue : Tissue id based on metadata for tissue-derived correlations (labeled All_samples if all samples are used)
percent_transcribed_both : Percentage of observed samples used in the analysis
pair_id : Gene:Transcript~Bidirectional pair name
gene_id : Gene id
chiapet : Binary indicator for whether the pair overlaps a polII ChIA-PET loop
gtex : Binary indicator for whether the pair overlaps a GTEx eQTL pair
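The column list above can be turned into a small reader for dbnascent_pairs.txt.gz. The sketch below is illustrative only: it assumes the file is tab-delimited with the 26 columns in the order listed, that a header row (if present) starts with `transcript1_chrom`, and that missing numeric values may appear as non-numeric strings. The helper names and the significance thresholds are hypothetical and are not taken from the paper.

```python
import csv
import gzip
from typing import Dict, Iterator, List, Sequence

# Column order assumed to match the description above.
PAIR_COLUMNS: List[str] = [
    "transcript1_chrom", "transcript1_start", "transcript1_stop",
    "transcript_1", "transcript1_score", "transcript1_strand",
    "transcript2_chrom", "transcript2_start", "transcript2_stop",
    "transcript_2", "transcript2_score", "transcript2_strand",
    "pcc", "pval", "adj_p_BH", "nObs", "t",
    "distance_tss", "distance_tes", "position", "tissue",
    "percent_transcribed_both", "pair_id", "gene_id", "chiapet", "gtex",
]
FLOAT_COLUMNS = ("pcc", "pval", "adj_p_BH", "t", "percent_transcribed_both")


def to_float(value: str) -> float:
    """Convert to float, mapping unparsable values (e.g. 'NA') to NaN."""
    try:
        return float(value)
    except ValueError:
        return float("nan")


def parse_pair(fields: Sequence[str]) -> Dict[str, object]:
    """Map one row onto the assumed column names and convert the statistics."""
    if len(fields) < len(PAIR_COLUMNS):
        raise ValueError(f"expected {len(PAIR_COLUMNS)} columns, got {len(fields)}")
    record: Dict[str, object] = dict(zip(PAIR_COLUMNS, fields))
    for key in FLOAT_COLUMNS:
        record[key] = to_float(str(record[key]))
    return record


def is_supported(record: Dict[str, object],
                 max_adj_p: float = 0.05,
                 min_abs_pcc: float = 0.5) -> bool:
    """Illustrative filter: BH-adjusted p-value and |PCC| cutoffs (hypothetical)."""
    return record["adj_p_BH"] <= max_adj_p and abs(record["pcc"]) >= min_abs_pcc


def read_pairs(path: str) -> Iterator[Dict[str, object]]:
    """Yield parsed pairs from a gzipped, tab-delimited pairs file."""
    with gzip.open(path, "rt", newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and a header row, if one is present.
            if not fields or fields[0] == PAIR_COLUMNS[0]:
                continue
            yield parse_pair(fields)
```

Streaming the file row by row avoids loading the full pair table into memory. Rows with NaN statistics fail the `is_supported` comparison and are therefore excluded by the filter.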
Created:
2024-12-18



