Semi-simulated TF perturbation in ATAC-seq datasets (GC content bias)
Source: NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/10781758
Description:
Semi-simulated ATAC-seq data with TF perturbations of different perturbation strengths and GC content biases. The three .tar.gz archives contain:
1. DTFAB_sim_gc_beds.tar.gz: .bed files of the semi-simulated ATAC-seq fragments
2. DTFAB_sim_gc_cm.tar.gz: count matrices (per-peak counts of semi-simulated fragments)
3. DTFAB_sim_gc_peaks.tar.gz: .bed files with ATAC-seq peak coordinates of semi-simulated fragments
The folder structure of 1., 2. & 3. has the following format: ___FALSE_. The original files, into which biases and perturbations were introduced, were retrieved from ENCODE with the following IDs: ENCFF495DQP, ENCFF130DND, ENCFF447ZRG, ENCFF966ELR, ENCFF358GWK, ENCFF963YZH. The ChIP-seq peaks used to introduce perturbations correspond to the following ENCODE identifiers: ENCFF592UDD, ENCFF250FJC.
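Because the internal folder layout is only partly documented here, it can help to list an archive's contents before unpacking it. The following is a minimal Python sketch using only the standard library. It assumes the .bed members are uncompressed, tab-separated text with chrom, start and end in the first three columns (the usual BED convention, not confirmed by this record). The function names and any member names passed in are hypothetical.

```python
import io
import tarfile
from typing import Iterator, List, Tuple


def list_members(archive_path: str) -> List[str]:
    """Return the names of all regular files inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as tar:
        return [m.name for m in tar.getmembers() if m.isfile()]


def iter_bed_intervals(
    archive_path: str, member_name: str
) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) from one BED member without unpacking.

    Assumption: the member is plain text (not gzip-compressed) and
    tab-separated, with chrom/start/end as its first three columns.
    """
    with tarfile.open(archive_path, mode="r:gz") as tar:
        handle = tar.extractfile(member_name)
        if handle is None:
            raise ValueError(f"{member_name} is not a regular file")
        for raw in io.TextIOWrapper(handle, encoding="utf-8"):
            line = raw.rstrip("\n")
            # Skip blank lines and optional BED header/comment lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split("\t")[:3]
            yield chrom, int(start), int(end)
```

For example, list_members("DTFAB_sim_gc_beds.tar.gz") would show the actual member paths, and the fragment length of each interval yielded by iter_bed_intervals is end - start.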
This semi-simulated dataset belongs to a series of datasets:
| Dataset | Description | DOI |
| --- | --- | --- |
| I. perturbation strength | TF perturbation with different strengths, no biases introduced | 10.5281/zenodo.10732704 |
| II. pos control | TF perturbation only introduced in (ChIP-) peaks with a motif of the respective TF | 10.5281/zenodo.10781849 |
| III. fld | TF perturbation with additionally introduced fragment length distribution bias | 10.5281/zenodo.10781109 |
| IV. gc (this) | TF perturbation with additionally introduced GC content bias | 10.5281/zenodo.10781759 |
Created:
2024-03-06



