
Atlas of nascent RNA transcripts reveals enhancer to gene linkages

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/10223321
Description:
Data associated with the paper "Atlas of nascent RNA transcripts reveals enhancer to gene linkages".
GitHub repository for the analyses: https://github.com/Dowell-Lab/DBNascent_Analysis

Below are summaries of the files associated with this publication.

1. muMerge calls for each paper used in the merging
   paper_mumerge_calls
   - The calls are separated by bidirectional caller (dreg, tfit).
   - Each folder contains bed files (e.g. Allen2014global_hg38_dreg_MUMERGE.bed) for each paper and species (hg38, mm10).

2. Base content for regions called by dREG and Tfit in each paper, in mouse and human
   mumerge_base_composition
   - The base content for each paper after the first round of muMerge.
   - The files contain the id and the nucleotide content in the 300bp around the region center (id, cg, at).

3. Bidirectional regions called by Tfit and dREG after merging, for mouse and human datasets (see https://github.com/Dowell-Lab/bidirectionals_merged)
   bidirectional_regions
   - Bidirectional regions called after muMerge and filtering.
   - Calls are reported for both human and mouse datasets (hg38_tfit_dreg_bidirectionals.bed.gz and mm10_tfit_dreg_bidirectionals.bed.gz).
   - The bed files are in bed6 format with the following columns: chromosome, start, stop, bidirectional id, number of papers in which the bidirectional was called, strand (always "." since bidirectionals are not stranded).

4. Metadata for samples used in the SPECS and correlation analysis
   metadata
   - Sample metadata for the filtered samples used in the downstream analyses (human_samples_QC_GC_protocol_filtered.tsv.gz).

5. SPECS scores across genes and bidirectional regions
   specs_scores
   - The SPECS scores for all tissues analyzed (filt_qc123_all_specs_all.txt.gz).
   - The maximum SPECS scores (filt_qc123_all_specs_maxval.txt.gz).
   - The minimum SPECS scores (filt_qc123_all_specs_minval.txt.gz).
   - The SPECS scores are also split by disease vs non-disease samples.
   - TPM summaries are also included.

6. Normalized counts
   normalized_counts
   - Gene and bidirectional region normalized counts (gene_bidir_tpm.tsv.gz).

7. Bidirectional region and gene pairs (see https://github.com/Dowell-Lab/bidir_gene_pairs)
   bidirectional_gene_pairs
   - Gene and bidirectional region pairs (dbnascent_pairs.txt.gz) across tissues in high-quality samples.
   - The pairs are reported in a bed12 file:
     - The first 6 columns are gene coordinates and the following 6 are bidirectional coordinates.
     - The remaining columns are the summary statistics for the correlation and the relationship between the gene and the bidirectional.
     - Additional columns note whether the pair overlaps eQTLs from GTEx or polII ChIA-PET loops.

   Columns:
   transcript1_chrom : Gene chromosome
   transcript1_start : Gene start coordinate
   transcript1_stop : Gene stop coordinate
   transcript_1 : Gene id
   transcript1_score : Gene score ("." since none was assigned)
   transcript1_strand : Gene strand
   transcript2_chrom : Bidirectional chromosome
   transcript2_start : Bidirectional start coordinate
   transcript2_stop : Bidirectional stop coordinate
   transcript_2 : Bidirectional id
   transcript2_score : Bidirectional score (i.e. the number of papers that support a bidirectional from muMerge)
   transcript2_strand : Bidirectional strand ("." since these are not stranded)
   pcc : Pearson's correlation coefficient
   pval : P-value
   adj_p_BH : Adjusted p-value (Benjamini-Hochberg correction)
   nObs : Number of observations in the correlation analysis
   t : T statistic
   distance_tss : Distance between the gene start (TSS) and the bidirectional start coordinate
   distance_tes : Distance between the gene stop (TES) and the bidirectional start coordinate
   position : Whether the bidirectional is upstream or downstream of the TSS
   tissue : Tissue id based on metadata for tissue-derived correlations (labeled All_samples if all samples are used)
   percent_transcribed_both : Percent of the number of observed samples used in the analysis
   pair_id : Gene:Transcript~Bidirectional pair name
   gene_id : Gene id
   chiapet : Binary indicator for whether a pair overlaps polII ChIA-PET loops
   gtex : Binary indicator for whether a pair overlaps GTEx pairs
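
The column layout described for dbnascent_pairs.txt.gz can be read with the Python standard library alone. The sketch below is illustrative, not the authors' code. It assumes the file is tab-delimited, that the columns appear in the order listed in the description, and that there is either no header row or one that starts with `transcript1_chrom`; none of this is confirmed by the description. The function names, the thresholds (`max_adj_p`, `min_abs_pcc`), and the two sample rows are hypothetical and exist only so the sketch can run without the real file.

```python
import csv
import gzip
import io

# Column names in the order given by the dataset description.
# Assumption: the file is tab-delimited with the columns in this order.
PAIR_COLUMNS = [
    "transcript1_chrom", "transcript1_start", "transcript1_stop",
    "transcript_1", "transcript1_score", "transcript1_strand",
    "transcript2_chrom", "transcript2_start", "transcript2_stop",
    "transcript_2", "transcript2_score", "transcript2_strand",
    "pcc", "pval", "adj_p_BH", "nObs", "t",
    "distance_tss", "distance_tes", "position", "tissue",
    "percent_transcribed_both", "pair_id", "gene_id", "chiapet", "gtex",
]


def open_pairs(path):
    """Open a plain or gzip-compressed pairs file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def read_pairs(handle):
    """Yield one dict per data row from an open text handle."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        if row[0] == PAIR_COLUMNS[0]:
            continue  # tolerate a header row, if the file has one
        yield dict(zip(PAIR_COLUMNS, row))


def _to_float(value):
    """Parse a numeric field, returning None for values such as 'NA'."""
    try:
        return float(value)
    except ValueError:
        return None


def significant_pairs(rows, max_adj_p=0.01, min_abs_pcc=0.6):
    """Keep pairs that pass the (hypothetical) thresholds on adj_p_BH and |pcc|."""
    for r in rows:
        adj_p = _to_float(r["adj_p_BH"])
        pcc = _to_float(r["pcc"])
        if adj_p is None or pcc is None:
            continue
        if adj_p <= max_adj_p and abs(pcc) >= min_abs_pcc:
            yield r


# Synthetic rows, made up for illustration only; they are not from the dataset.
SAMPLE_ROWS = [
    ["chr1", "1000", "2000", "GENE_A:TX1", ".", "+",
     "chr1", "5000", "5600", "chr1:5000-5600", "3", ".",
     "0.82", "1e-6", "1e-4", "120", "15.5",
     "4000", "3000", "downstream", "All_samples",
     "40.0", "GENE_A:TX1~chr1:5000-5600", "GENE_A", "1", "0"],
    ["chr2", "9000", "9900", "GENE_B:TX1", ".", "-",
     "chr2", "7000", "7400", "chr2:7000-7400", "1", ".",
     "0.10", "0.4", "0.5", "95", "0.9",
     "2900", "2000", "downstream", "All_samples",
     "12.5", "GENE_B:TX1~chr2:7000-7400", "GENE_B", "0", "0"],
]
SAMPLE = "\n".join("\t".join(r) for r in SAMPLE_ROWS) + "\n"

if __name__ == "__main__":
    for pair in significant_pairs(read_pairs(io.StringIO(SAMPLE))):
        print(pair["pair_id"], pair["pcc"], pair["adj_p_BH"])
```

To apply this to the real file, replace the `io.StringIO(SAMPLE)` handle with `open_pairs("dbnascent_pairs.txt.gz")` and first check the delimiter and header against the first few lines of the file.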
Created:
2024-12-18