
ChromBPNet models and data: Transcription factor stoichiometry, motif affinity and syntax regulate single cell chromatin dynamics during fibroblast reprogramming to pluripotency

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8299710
Description:
This record contains the ChromBPNet models, and the data used to train them, for the paper "Transcription factor stoichiometry, motif affinity and syntax regulate single cell chromatin dynamics during fibroblast reprogramming to pluripotency" by Nair, Ameen et al.

`data` contains the bigwigs and regions (peaks + non-peaks) used for training each of the models. See `data/README.txt` for more details.

Models:

Loading the model: The models were trained using tf1.14. They are provided in h5 format for tf1.14 (py3.7) and in SavedModel format for tf2.X. The tf2.X models were tested only with py3.8-3.11 and tf2.8-2.13.

To load the models in tf1.14:

    import tensorflow as tf
    model = tf.keras.models.load_model("path/to/model.h5")

In tf2:

    model = tf.keras.models.load_model("path/to/model_dir")

If all else fails, you can build the architecture provided in `model_arch.py` with default parameters (`bpnet_seq` for the bias model and `chrombpnet` for the ChromBPNet model), and then load the weights with `model.load_weights` from the files in the `weights` directory.

Usage: The bias model takes as input a one-hot sequence of length 2000. It has 2 outputs: a vector of logits of length 2000 and 1 log-counts scalar:

    # seq_one_hot: B x 2000 x 4
    out_bias_logits, out_bias_logcounts = bias_model.predict(seq_one_hot)
    # out_bias_logits: B x 2000
    # out_bias_logcounts: B x 1

The ChromBPNet model takes as input a one-hot sequence of length 2000, bias logits of length 2000 and a bias log-counts scalar. It has the same output types as the bias model.
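Both models expect `seq_one_hot` with shape B x 2000 x 4. The following is a minimal sketch of building such an array with NumPy. The helper name `one_hot_encode`, the A, C, G, T channel order and the all-zero encoding of other bases such as N are assumptions made for illustration and are not stated in this record, so they should be checked against the training code before use.

```python
import numpy as np

# Assumed channel order; the record does not state it.
ALPHABET = "ACGT"
INPUT_LEN = 2000  # input length stated in the record


def one_hot_encode(seqs, alphabet=ALPHABET, length=INPUT_LEN):
    """Return a float32 array of shape (B, length, 4) for a list of DNA strings.

    Bases outside the alphabet (for example N) are left as all-zero rows,
    which is an assumption of this sketch.
    """
    index = {base: i for i, base in enumerate(alphabet)}
    out = np.zeros((len(seqs), length, len(alphabet)), dtype=np.float32)
    for b, seq in enumerate(seqs):
        if len(seq) != length:
            raise ValueError(f"sequence {b} has length {len(seq)}, expected {length}")
        for pos, base in enumerate(seq.upper()):
            col = index.get(base)
            if col is not None:
                out[b, pos, col] = 1.0
    return out


# Toy input: one repetitive sequence and one made only of N.
seq_one_hot = one_hot_encode(["ACGT" * 500, "N" * 2000])
print(seq_one_hot.shape)  # (2, 2000, 4)
```

The resulting array can be passed as `seq_one_hot` wherever the usage snippets in this record call `predict`.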
To run the ChromBPNet model to obtain predictions:

    pred_profile, pred_logcounts = chrombpnet_model.predict([seq_one_hot, out_bias_logits, out_bias_logcounts])
    # pred_profile: B x 2000
    # pred_logcounts: B x 1

If you wish to obtain the "de-biased" predictions (see Methods), simply pass in zeros instead of the bias model predictions:

    import numpy as np
    pred_profile_debiased, pred_logcounts_debiased = chrombpnet_model.predict([
        seq_one_hot,
        np.zeros((seq_one_hot.shape[0], 2000)),
        np.zeros((seq_one_hot.shape[0], 1)),
    ])

To obtain predicted per-base counts (with or without bias):

    import numpy as np
    import scipy.special
    pred_per_base_counts = scipy.special.softmax(pred_profile, axis=-1) * (np.exp(pred_logcounts) - 1)
    # pred_per_base_counts: B x 2000

Note that in general predicted counts can't be compared across models, as they are not corrected for sequencing depth.

Note: All bias models used across folds are identical, except for the final intercept term in the counts output (see Methods), which is specific to each combination of cell state and fold.

Folds: The splits used for training the different folds are as below:

| Fold | Test Chromosomes | Validation Chromosomes |
|------|------------------|------------------------|
| 0 | chr1 | chr8, chr10 |
| 1 | chr2, chr19 | chr1 |
| 2 | chr3, chr20 | chr2, chr19 |
| 3 | chr6, chr13, chr22 | chr3, chr20 |
| 4 | chr5, chr16, chrY | chr6, chr13, chr22 |
| 5 | chr4, chr15, chr21 | chr5, chr16, chrY |
| 6 | chr7, chr18, chr14 | chr4, chr15, chr21 |
| 7 | chr11, chr17, chrX | chr7, chr18, chr14 |
| 8 | chr9, chr12 | chr11, chr17, chrX |
| 9 | chr8, chr10 | chr9, chr12 |

For each fold, the remaining chromosomes were used as the training chromosomes.
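The per-base count conversion described under Usage can be checked without loading a model. The sketch below applies the same formula to random stand-in arrays. The helper names are hypothetical, and a plain NumPy softmax is swapped in for `scipy.special.softmax` to keep the example dependency-free. The `- 1` in the formula suggests the counts output is log(1 + total counts); that reading is an inference from the formula, not something this record states.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; stands in for scipy.special.softmax."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def per_base_counts(pred_profile, pred_logcounts):
    """Convert profile logits (B x L) and log-counts (B x 1) to counts (B x L)."""
    total = np.exp(pred_logcounts) - 1  # B x 1, broadcasts over positions
    return softmax(pred_profile, axis=-1) * total


# Random stand-ins for model outputs, used only to show shapes and totals.
rng = np.random.default_rng(0)
pred_profile = rng.normal(size=(3, 2000))
pred_logcounts = np.log1p(np.array([[10.0], [250.0], [0.0]]))

counts = per_base_counts(pred_profile, pred_logcounts)
print(counts.shape)         # (3, 2000)
print(counts.sum(axis=-1))  # approximately [10, 250, 0]
```

Because the softmax over positions sums to 1, each row of the result sums to `exp(pred_logcounts) - 1`, so the profile only decides how that total is distributed along the 2000 positions.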
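The fold splits can also be written down as a small lookup, which makes the training chromosomes for a fold easy to derive. This is a sketch: the names `FOLDS`, `chrom_key` and `training_chromosomes` are hypothetical, and the full chromosome set is assumed to be the union of all test chromosomes in the table (chr1 to chr22, chrX, chrY), with no other contigs involved.

```python
# Test and validation chromosomes per fold, copied from the table in this record.
FOLDS = {
    0: {"test": ["chr1"], "valid": ["chr8", "chr10"]},
    1: {"test": ["chr2", "chr19"], "valid": ["chr1"]},
    2: {"test": ["chr3", "chr20"], "valid": ["chr2", "chr19"]},
    3: {"test": ["chr6", "chr13", "chr22"], "valid": ["chr3", "chr20"]},
    4: {"test": ["chr5", "chr16", "chrY"], "valid": ["chr6", "chr13", "chr22"]},
    5: {"test": ["chr4", "chr15", "chr21"], "valid": ["chr5", "chr16", "chrY"]},
    6: {"test": ["chr7", "chr18", "chr14"], "valid": ["chr4", "chr15", "chr21"]},
    7: {"test": ["chr11", "chr17", "chrX"], "valid": ["chr7", "chr18", "chr14"]},
    8: {"test": ["chr9", "chr12"], "valid": ["chr11", "chr17", "chrX"]},
    9: {"test": ["chr8", "chr10"], "valid": ["chr9", "chr12"]},
}


def chrom_key(chrom):
    """Sort key giving natural order: chr1..chr22, then chrX, chrY."""
    name = chrom[3:]
    return (0, int(name)) if name.isdigit() else (1, name)


def training_chromosomes(fold, folds=FOLDS):
    """Chromosomes that are neither test nor validation for the given fold.

    Assumes the full set is the union of test chromosomes over all folds.
    """
    all_chroms = {c for split in folds.values() for c in split["test"]}
    held_out = set(folds[fold]["test"]) | set(folds[fold]["valid"])
    return sorted(all_chroms - held_out, key=chrom_key)


train_fold0 = training_chromosomes(0)
print(len(train_fold0))  # 21
```

In the table, the validation chromosomes of each fold are the test chromosomes of the previous fold (fold 0 uses those of fold 9), so every chromosome is held out for testing exactly once.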
Created:
2023-11-11