Precise Prediction of Promoter Strength Based on a De Novo Synthetic Promoter Library Coupled with Machine Learning
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://figshare.com/articles/dataset/Precise_Prediction_of_Promoter_Strength_Based_on_a_De_Novo_Synthetic_Promoter_Library_Coupled_with_Machine_Learning/17294021
Description:
Promoters are one of the most critical regulatory elements controlling
metabolic pathways. However, the fast and accurate prediction of promoter
strength remains challenging, leading to time- and labor-consuming
promoter construction and characterization processes. This dilemma
is caused by the lack of a large promoter library that has gradient
strengths, broad dynamic ranges, and clear sequence profiles that
can be used to train an artificial intelligence model of promoter
strength prediction. To overcome this challenge, we constructed and
characterized a mutant library of Trc promoters (Ptrc) using 83 rounds of mutation-construction-screening-characterization
engineering cycles. After excluding invalid mutation sites, we established
a synthetic promoter library that consisted of 3665 different variants,
displaying an intensity range of more than two orders of magnitude.
The strongest variant was ∼69-fold stronger than the original Ptrc and 1.52-fold stronger than a 1 mM isopropyl-β-D-thiogalactoside-driven PT7 promoter,
with an ∼454-fold difference between the strongest and weakest
expression levels. Using this synthetic promoter library, different
machine learning models were built and optimized to explore the relationships
between promoter sequences and transcriptional strength. Finally,
our XGBoost model exhibited optimal performance, and we utilized this
approach to precisely predict the strength of artificially designed
promoter sequences (R² = 0.88, mean absolute
error = 0.15, and Pearson correlation coefficient = 0.94). Our work
provides a powerful platform that enables the predictable tuning of
promoters to achieve optimal transcriptional strength.
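
The three metrics quoted above (R², mean absolute error, and Pearson correlation coefficient) can be computed from paired measured and predicted strengths. The following is a minimal sketch in plain Python, not the authors' evaluation code. The function names and the `measured` and `predicted` values are hypothetical placeholders and are not taken from the dataset; the scale on which the authors computed their metrics is not stated here.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between the two value lists."""
    mean_true = sum(y_true) / len(y_true)
    mean_pred = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov / math.sqrt(var_true * var_pred)


# Hypothetical example values (NOT from the dataset): measured promoter
# strengths and the corresponding model predictions, in arbitrary units.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.2, 1.9, 3.2, 3.7]

r2 = r_squared(measured, predicted)
mae = mean_absolute_error(measured, predicted)
pcc = pearson_r(measured, predicted)
print(f"R2={r2:.3f} MAE={mae:.3f} PCC={pcc:.3f}")
```

R² and the Pearson coefficient answer different questions: R² penalizes systematic offsets between predictions and measurements, while the Pearson coefficient only measures how linearly the two track each other, which is why both are commonly reported together.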
Created:
2021-12-20



