Uniting Cheminformatics and Chemical Theory To Predict the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules
Source: NIAID Data Ecosystem. Indexed: 2026-03-09
Download link:
https://figshare.com/articles/dataset/Uniting_Cheminformatics_and_Chemical_Theory_To_Predict_the_Intrinsic_Aqueous_Solubility_of_Crystalline_Druglike_Molecules/2030508
Resource description:
We present four models of solution free-energy prediction for druglike
molecules utilizing cheminformatics descriptors and theoretically
calculated thermodynamic values. We make predictions of solution free
energy using physics-based theory alone and using machine learning/quantitative
structure–property relationship (QSPR) models. We also develop
machine learning models where the theoretical energies and cheminformatics
descriptors are used as combined input. These models are used to predict
solvation free energy. While direct theoretical calculation does not
give accurate results in this approach, machine learning is able to
give predictions with a root mean squared error (RMSE) of ∼1.1
log S units in a 10-fold cross-validation for our
Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules.
We find that a model built using energy terms from our theoretical
methodology as descriptors is marginally less predictive than one
built on Chemistry Development Kit (CDK) descriptors. Combining both
sets of descriptors allows a further but very modest improvement in
the predictions. However, in some cases, this is a statistically significant
enhancement. These results suggest that there is little complementarity
between the chemical information provided by these two sets of descriptors,
despite their different sources and methods of calculation. Our machine
learning models are also able to predict the well-known Solubility
Challenge dataset with an RMSE value of 0.9–1.0 log S units.
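
The RMSE figures quoted above come from 10-fold cross-validation. As a rough illustration of how such a number is obtained, the following minimal sketch pools the held-out predictions from all folds and computes a single RMSE. It is not the authors' pipeline: the one-descriptor least-squares line stands in for their machine learning models, and the helper names (`rmse`, `k_fold_indices`, `fit_line`, `cross_validated_rmse`) and the synthetic data are assumptions made only for this example.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (placeholder model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def cross_validated_rmse(xs, ys, k=10, seed=0):
    """Train on k-1 folds, predict the held-out fold, pool predictions, return RMSE."""
    y_true, y_pred = [], []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            y_true.append(ys[i])
            y_pred.append(slope * xs[i] + intercept)
    return rmse(y_true, y_pred)


if __name__ == "__main__":
    # Synthetic stand-in for 100 molecules: one made-up descriptor and a noisy
    # "log S" value. These numbers are NOT taken from the DLS-100 dataset.
    rng = random.Random(42)
    descriptor = [rng.uniform(0.0, 6.0) for _ in range(100)]
    log_s = [-0.8 * x - 1.0 + rng.gauss(0.0, 1.0) for x in descriptor]
    print(f"10-fold CV RMSE: {cross_validated_rmse(descriptor, log_s):.2f} log S units")
```

Pooling all held-out predictions before taking the RMSE is one common convention; averaging per-fold RMSE values is another, and the abstract does not say which was used.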
Created:
2015-12-17



