Automation and machine learning drive rapid optimization of isoprenol production in Pseudomonas putida

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.gtht76hzh




Description:

Advances in genome engineering have improved our ability to perturb microbial metabolic networks, yet bioproduction campaigns often struggle to parse complex metabolic datasets in ways that efficiently enhance product titers. We address this challenge by coupling laboratory automation with machine learning to systematically optimize the production of isoprenol, a sustainable aviation fuel (SAF) precursor, in Pseudomonas putida. The simultaneous downregulation through CRISPR interference (CRISPRi) of combinations of up to four gene targets, guided by machine learning (ML), permitted us to increase isoprenol titer 5-fold over six consecutive DBTL cycles. Moreover, ML enabled us to swiftly explore a vast experimental design space of 800,000 possible combinations by strategically recommending approximately 400 priority constructs. High-throughput proteomics allowed us to validate CRISPRi downregulation and identify biological mechanisms driving production increases. Our work demonstrates that ML-driven automated DBTL cycles can rapidly enhance titers without specific biological knowledge, suggesting that this approach can be applied to any host, product, or pathway.

Methods:

High-throughput proteomics data were generated to monitor the effects of CRISPRi-mediated gene knockdowns on protein expression levels across six DBTL cycles (DBTL0-DBTL6). The sample preparation protocol is detailed at Protocols.io dx.doi.org/10.17504/protocols.io.6qpvr6xjpvmk/v1. Protein was extracted from P. putida cell pellets using Qiagen P2 Lysis Buffer, precipitated with acetone, and digested with trypsin. The resulting tryptic peptides were analyzed using an Agilent 1290 UHPLC system coupled to a Thermo Scientific Orbitrap Exploris 480 mass spectrometer operated in data-independent acquisition (DIA) mode.

The data processing protocol is detailed at Protocols.io dx.doi.org/10.17504/protocols.io.5qpvobk7xl4o/v2. DIA raw data were processed using DIA-NN software (library-free mode) against a database containing the P. putida KT2440 UniProt proteome, heterologous proteins, and common contaminants. Protein quantification was performed using the Top3 method (Ahrne et al. 2013 DOI:10.1002/pmic.201300135), averaging the signal response of the three most intense tryptic peptides for each protein. Data were filtered to a global false discovery rate (FDR) ≤ 0.01 at both precursor and protein group levels.

The generated mass spectrometry proteomics data have been deposited in the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifiers PXD063733 (DBTL0), PXD063737 (DBTL1), PXD063738 (DBTL2), PXD063740 (DBTL3), PXD063743 (DBTL4), PXD063744 (DBTL5), and PXD063746 (DBTL6). DIA-NN is freely available for download from https://github.com/vdemichev/DiaNN.
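The Top3 quantification step described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' pipeline and not DIA-NN code. The function name `top3_quantify`, the tuple layout, the example intensities, and the single per-peptide q-value filter are all assumptions made for illustration; the dataset itself applies a global FDR ≤ 0.01 at both precursor and protein group levels. How proteins with fewer than three peptides are handled is also an assumption here (the available peptides are averaged).

```python
from collections import defaultdict


def top3_quantify(peptides, n_top=3, fdr_cutoff=0.01):
    """Average the n_top most intense peptides per protein.

    `peptides` is an iterable of (protein, peptide, intensity, q_value)
    tuples. This layout is hypothetical, not a DIA-NN output format.
    """
    by_protein = defaultdict(list)
    for protein, _peptide, intensity, q_value in peptides:
        # Simplified stand-in for the FDR <= 0.01 filter in the Methods.
        if q_value <= fdr_cutoff:
            by_protein[protein].append(intensity)

    quantities = {}
    for protein, intensities in by_protein.items():
        # Keep the three strongest signals; if fewer survive the filter,
        # average what is available (an assumption of this sketch).
        top = sorted(intensities, reverse=True)[:n_top]
        quantities[protein] = sum(top) / len(top)
    return quantities


# Made-up example values, not taken from the deposited data.
rows = [
    ("ProtA", "pep1", 900.0, 0.001),
    ("ProtA", "pep2", 600.0, 0.002),
    ("ProtA", "pep3", 300.0, 0.005),
    ("ProtA", "pep4", 100.0, 0.004),   # passes the filter but is not in the top 3
    ("ProtA", "pep5", 5000.0, 0.20),   # fails the q-value filter
    ("ProtB", "pep6", 400.0, 0.003),
    ("ProtB", "pep7", 200.0, 0.008),
]

quant = top3_quantify(rows)
print(quant)
```

In this toy input, ProtA is quantified from its three strongest passing peptides (900, 600, 300), while the very intense but low-confidence pep5 is excluded before ranking. ProtB has only two passing peptides, so both are averaged.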

Created:

2025-08-20