A multiple linear regression approach to the estimation of carboxylic acid ester and lactone alkaline hydrolysis rate constants
Source: DataCite Commons. Updated: 2023-04-26. Indexed: 2024-08-18.
Download link:
https://tandf.figshare.com/articles/dataset/A_multiple_linear_regression_approach_to_the_estimation_of_carboxylic_acid_ester_and_lactone_alkaline_hydrolysis_rate_constants/22325009
Description:
Pesticides, pharmaceuticals, and other organic contaminants often undergo hydrolysis when released into the environment; therefore, measured or estimated hydrolysis rates are needed to assess their environmental persistence. An intuitive multiple linear regression (MLR) approach was used to develop robust QSARs for predicting base-catalyzed rate constants of carboxylic acid esters (CAEs) and lactones. We explored various combinations of independent descriptors, resulting in four primary models (two for lactones and two for CAEs), with a total of 15 and 11 parameters included in the CAE and lactone QSAR models, respectively. The most significant descriptors include pKₐ, electronegativity, charge density, and steric parameters. Model performance is assessed using Drug Theoretics and Cheminformatics Laboratory’s DTC-QSAR tool, demonstrating high accuracy for both internal validation (r² = 0.93 and RMSE = 0.41–0.43 for CAEs; r² = 0.90–0.93 and RMSE = 0.38–0.46 for lactones) and external validation (r² = 0.93 and RMSE = 0.43–0.45 for CAEs; r² = 0.94–0.98 and RMSE = 0.33–0.41 for lactones). The developed models require only low-cost computational resources and have substantially improved performance compared to existing hydrolysis rate prediction models (HYDROWIN and SPARC).
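For readers unfamiliar with the workflow the description summarizes, the following is a minimal sketch of fitting a multiple linear regression and reporting r² and RMSE on a training split and a held-out split. It is not the authors' models or the DTC-QSAR tool: the three descriptors, the coefficients in `true_coefs`, the noise level, and the 45/15 split are all hypothetical placeholders generated synthetically, and the helper names (`fit_mlr`, `r2_and_rmse`) are introduced here for illustration only.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are left unmodified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(X, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, b1, b2, ...].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, X):
    """Evaluate intercept + sum(b_k * x_k) for each row of X."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in X]


def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Synthetic placeholder data: 60 "compounds", 3 hypothetical descriptors.
rng = random.Random(0)
true_coefs = [1.5, -0.8, 0.6, -0.3]  # hypothetical, not from the paper
X = [[rng.uniform(-2, 2) for _ in range(3)] for _ in range(60)]
y = [v + rng.gauss(0, 0.2) for v in predict(true_coefs, X)]

# Fit on a training split, then score both splits.
X_train, y_train = X[:45], y[:45]
X_test, y_test = X[45:], y[45:]
coefs = fit_mlr(X_train, y_train)
r2_train, rmse_train = r2_and_rmse(y_train, predict(coefs, X_train))
r2_test, rmse_test = r2_and_rmse(y_test, predict(coefs, X_test))
print(f"train: r2={r2_train:.2f} rmse={rmse_train:.2f}")
print(f"test:  r2={r2_test:.2f} rmse={rmse_test:.2f}")
```

The training-split score here is a plain goodness of fit, and the held-out score is a simple external check. Dedicated QSAR validation tools typically report additional statistics (for example cross-validated metrics), which this sketch does not attempt to reproduce.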
Provider:
Taylor & Francis
Created:
2023-03-23



