
Datasets for predicting TF binding using Virtual ChIP-seq

NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/823296
Description:
This repository contains the datasets needed to use the Virtual ChIP-seq software. Virtual ChIP-seq requires the following datasets to predict transcription factor (TF) binding:

chipExpDir_AtoH_V1.0.0.tar.gz: Reference matrices of correlation between TF binding and gene expression for TFs starting with letters A-H.
chipExpDir_ItoZ_V1.0.0.tar.gz: Reference matrices of correlation between TF binding and gene expression for TFs starting with letters I-Z.
refTables_V1.1.0.tar.gz: PhastCons genomic conservation, FIMO PWM scores for JASPAR motifs, and ChIP-seq data from the ENCODE and Cistrome databases.
hg38_chrsize.tsv: Length of chromosomes in hg38.
trainedModels_V1.0.0.tar.gz: Virtual ChIP-seq scikit-learn trained models saved in joblib format.
.tar.gz: Pre-calculated matrices suitable for training with other algorithms or re-training with Virtual ChIP-seq.

Some predictive features of TF binding are the same in each cell type and are stored together for simplicity in refTables_V1.0.0.tar.gz. You can use datasets from other cell types (named here as .tar.gz) to re-train the model. The .tar.gz files contain pre-calculated predictive features of transcription factor binding in 4 chromosomes (5, 10, 15, 20). These features include:

PhastCons genomic conservation
FIMO score for sequence motifs of TF in the JASPAR database
Chromatin accessibility
TF binding in ENCODE + Cistrome DB datasets
Virtual ChIP-seq expression score
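Because the reference matrices are split into two archives by the first letter of the TF name, it can help to work out which archive to download before fetching anything. The following is a minimal sketch, not part of the Virtual ChIP-seq software: the function names `archive_for_tf` and `parse_chrom_sizes` are hypothetical, and the layout assumed for hg38_chrsize.tsv (two tab-separated columns, chromosome name then length) is a guess that should be checked against the actual file.

```python
# File names exactly as listed in the description above.
ARCHIVE_A_TO_H = "chipExpDir_AtoH_V1.0.0.tar.gz"
ARCHIVE_I_TO_Z = "chipExpDir_ItoZ_V1.0.0.tar.gz"


def archive_for_tf(tf_name):
    """Pick the chipExpDir archive by the first letter of a TF name.

    Hypothetical helper: it only encodes the A-H / I-Z split stated
    in the description and does not inspect archive contents.
    """
    name = tf_name.strip()
    if not name:
        raise ValueError("TF name is empty")
    first = name[0].upper()
    if "A" <= first <= "H":
        return ARCHIVE_A_TO_H
    if "I" <= first <= "Z":
        return ARCHIVE_I_TO_Z
    raise ValueError(f"TF name does not start with a letter A-Z: {tf_name!r}")


def parse_chrom_sizes(lines):
    """Parse chromosome sizes from an iterable of text lines.

    ASSUMPTION: each line holds a chromosome name and an integer
    length separated by a tab. Lines that do not fit this shape
    (for example a header row) are skipped.
    """
    sizes = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[1].strip().isdigit():
            continue
        sizes[fields[0]] = int(fields[1])
    return sizes
```

For example, `archive_for_tf("CTCF")` returns the A-H archive name and `archive_for_tf("MYC")` returns the I-Z one. To read the chromosome sizes, open the file and pass the handle directly: `with open("hg38_chrsize.tsv") as fh: sizes = parse_chrom_sizes(fh)`.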
Created:
2020-01-24