
The adapted Activity-By-Contact model for enhancer-gene assignment and its application to single-cell data

Source: NIAID Data Ecosystem, indexed 2026-03-14
Download link:
https://zenodo.org/record/5841991
Description:
In our work, we implemented the ABC-model and showed that a single assay for measuring enhancer openness is sufficient. We further propose an adapted calculation of the ABC-score that describes enhancer activity in a gene-specific manner without requiring any additional data. We combined our implementation of the ABC-score with an approach to quantify TF binding affinity into STARE, a framework to derive TF affinities to genes. STARE was also designed for potential application to single-cell data. You can find the code in our GitHub repository and more details in our preprint.

We provide the data for the validation of our ABC implementation on two CRISPR screens, as well as the results of our analysis of human heart single-cell data with STARE. All data are in hg19.

Content:

K562_CandidateEnhancer: K562 enhancers, with the 4th column giving the enhancer activity; one file for each activity representation that was measured.

K562_ABC_Predictions: Regular ABC-scores and adapted ABC-scores for each activity measurement. The files contain all scored interactions within a 10 MB window, without any cut-off. We also included the results of the ABC-score implementation of Fulco et al. (2019).

STARE_Hocker_*: Complete STARE output for the human heart single-cell data, one folder each for regular ABC, adapted ABC, adapted ABC with the average Hi-C matrix, and a run based on co-accessibility analysis. All approaches were run with a 5 MB window, the ABC-based runs with a score cut-off of 0.02. Each folder contains two subdirectories, one for the ABC-scoring and one for the Gene-TF affinity matrices. The 'ABC_output' subdirectory also contains a GeneInfo file for each cell type, summarising different attributes per gene.

INVOKE_Hocker_*: Folders with the input and output of INVOKE (see https://github.com/schulzlab/tepic), based on the STARE runs. "CS genes" stands for cell type-specific genes, defined as genes with a z-score across cell types of ≥ 2 and TPM ≥ 0.5. The INVOKE commands were as follows:
Rscript INVOKE.R --dataDir= --outDir= --response=Expression --regularization=E --performance=TRUE --outerCV=10 --seed=1234

FulcoValidation_GeneAnnotation_hg19_gtfStyle.txt: Gene annotation from the supplementary material of Fulco et al. (2019), formatted into gtf-style to work with STARE.

Importantly, the results are based on the following publications:

K562 predictions and average Hi-C matrix: Fulco, C. P. et al. (2019). Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nature Genetics, 51(12), 1664–1669

Hi-C matrix for K562 predictions: Rao, S. et al. (2014). A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell, 159(7), 1665–1680

STARE and INVOKE runs: Hocker, J. D. et al. (2021). Cardiac cell type–specific gene regulatory programs and disease risk association. Science Advances, 7(20), eabf1444

H3K27ac HiChIP for STARE runs: Anene-Nzelu, C. G. et al. (2020). Assigning Distal Genomic Enhancers to Cardiac Disease–Causing Genes. Circulation, 142(9), 910–912

INVOKE software: Schmidt et al. (2016). Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction. Nucleic Acids Research; doi: 10.1093/nar/gkw1061
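
For readers unfamiliar with the regular ABC-score mentioned in the description, the following is a minimal sketch of the definition from Fulco et al. (2019): the activity of an enhancer times its contact with the gene, divided by the sum of that product over all candidate enhancers in the gene's window. It is not the STARE implementation, which may differ in details, and the adapted ABC-score is not shown because the description does not define it. The function names, the toy values, and the choice to keep scores at or above the 0.02 cut-off (rather than strictly above) are assumptions made for illustration.

```python
def abc_scores(activity, contact):
    """Regular ABC-score for every candidate enhancer of one gene.

    activity: enhancer activities (e.g. an openness signal)
    contact:  enhancer-gene contact frequencies (e.g. from Hi-C)
    Both lists cover the enhancers inside the window around the gene.
    """
    if len(activity) != len(contact):
        raise ValueError("activity and contact must have equal length")
    # Activity-by-contact product for each enhancer.
    products = [a * c for a, c in zip(activity, contact)]
    total = sum(products)
    if total == 0:
        # No enhancer contributes anything; all scores are zero.
        return [0.0 for _ in products]
    # Normalise so the scores of one gene sum to 1.
    return [p / total for p in products]


def apply_cutoff(scores, cutoff=0.02):
    """Indices of enhancers whose score reaches the cut-off (assumed inclusive)."""
    return [i for i, s in enumerate(scores) if s >= cutoff]


# Hypothetical toy values, not taken from the dataset.
activity = [10.0, 5.0, 1.0]
contact = [0.5, 0.2, 0.01]
scores = abc_scores(activity, contact)
kept = apply_cutoff(scores)
```

With these toy values the third enhancer falls below 0.02 and is dropped, while the first two are kept.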
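
The definition of cell type-specific (CS) genes given above (z-score across cell types of ≥ 2 and TPM ≥ 0.5) can be sketched as follows. The description does not say how the z-score was computed, so this sketch assumes it is taken on raw TPM values with the population standard deviation, and that the TPM threshold applies to the expression in the cell type in question. The function name, gene names, and cell type labels are hypothetical.

```python
from statistics import mean, pstdev


def cell_type_specific_genes(tpm, z_min=2.0, tpm_min=0.5):
    """Map each cell type to the genes that are specific to it.

    tpm: {gene: {cell_type: TPM value}}
    A gene counts as specific to a cell type if its z-score across
    cell types is >= z_min and its TPM in that cell type is >= tpm_min.
    """
    result = {}
    for gene, per_type in tpm.items():
        values = list(per_type.values())
        mu = mean(values)
        sd = pstdev(values)  # assumption: population standard deviation
        if sd == 0:
            # Identical expression everywhere: no z-score, never specific.
            continue
        for cell_type, value in per_type.items():
            z = (value - mu) / sd
            if z >= z_min and value >= tpm_min:
                result.setdefault(cell_type, []).append(gene)
    return result


# Hypothetical toy expression table with six cell types.
types = ["T1", "T2", "T3", "T4", "T5", "T6"]
tpm = {
    "GENE_A": dict(zip(types, [50.0, 0.0, 0.0, 0.0, 0.0, 0.0])),
    "GENE_B": dict(zip(types, [0.4, 0.0, 0.0, 0.0, 0.0, 0.0])),
    "GENE_C": dict(zip(types, [5.0, 5.0, 5.0, 5.0, 5.0, 5.0])),
}
cs_genes = cell_type_specific_genes(tpm)
```

In this toy table only GENE_A qualifies: GENE_B has a high z-score in T1 but fails the TPM threshold, and GENE_C is expressed uniformly.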
Created:
2023-02-24