Machine Learning Based Quantitative Structure–Dissolution Profile Relationship
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Machine_Learning_Based_Quantitative_Structure_Dissolution_Profile_Relationship/29244880
Description:
Accurately characterizing drug dissolution in the gastrointestinal
tract is critical in drug discovery, as dissolution profiles provide
essential information for estimating the bioavailability of orally
administered drugs. While various methods have been developed to predict
drug solubility based on chemical structures, no reliable tools currently
exist for predicting the dissolution rate constant. This study presents
a novel two-stage machine learning approach, termed Machine Learning
based Quantitative Structure–Dissolution Profile Relationship,
which integrates physics-informed neural networks (PINNs) and deep
neural networks (DNNs) to predict drug dissolution profiles in water
containing varying concentrations of the surfactant sodium lauryl sulfate. In
the first stage, PINNs extract key dissolution parameters, namely
the dissolution rate constant (k) and the dissolved
mass fraction at saturation (ϕs), from
existing dissolution data. By leveraging a physical law governing
the dissolution process, PINNs aim to enhance prediction performance
and reduce data requirements. Assuming first-order kinetics of the
drug dissolution process as described by the Noyes–Whitney
equation, PINNs, with 8 hidden layers and 40 neurons per layer, may
outperform traditional nonlinear regression by effectively filtering
noise and focusing on physically meaningful data. In the second stage,
these extracted parameters (k and ϕs) are used to train a DNN to predict dissolution
profiles based on the drug’s chemical structure and dissolution
medium. Evaluated with the FDA-recommended difference and similarity
factors (f1 and f2), the DNN (with 128 neurons in two hidden layers and
a learning rate of 10^(-2.8)) achieved an average
testing accuracy of 61.7% at an 80:20 train-to-test split. Although
the current accuracy is below the generally acceptable range of 70–80%,
this approach shows significant potential as a low-cost, time-efficient
tool for early phase drug formulation. Future improvements are expected
as data quality and diversity increase.
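
The quantities named in the description can be illustrated with a short Python sketch. It assumes the first-order form ϕ(t) = ϕs · (1 − exp(−k·t)), which is the standard closed-form solution of Noyes–Whitney-type kinetics; the exact expression used in the dataset's paper is not given here. It also uses the widely published definitions of the difference factor f1 and similarity factor f2, computed on percent dissolved. The function names, time points, and parameter values are hypothetical and do not come from the dataset, and the sketch does not reproduce the PINN or DNN stages.

```python
import math


def first_order_profile(times, k, phi_s):
    """Dissolved mass fraction under assumed first-order kinetics:
    phi(t) = phi_s * (1 - exp(-k * t))."""
    return [phi_s * (1.0 - math.exp(-k * t)) for t in times]


def difference_factor_f1(reference, test):
    """f1 = 100 * sum(|R_t - T_t|) / sum(R_t), profiles in percent dissolved."""
    numerator = sum(abs(r - t) for r, t in zip(reference, test))
    return 100.0 * numerator / sum(reference)


def similarity_factor_f2(reference, test):
    """f2 = 50 * log10(100 / sqrt(1 + mean((R_t - T_t)^2)))."""
    n = len(reference)
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


# Hypothetical sampling times (minutes) and parameters, for illustration only.
times = [5, 10, 15, 30, 45, 60]
measured = [100.0 * phi for phi in first_order_profile(times, k=0.050, phi_s=0.90)]
predicted = [100.0 * phi for phi in first_order_profile(times, k=0.045, phi_s=0.88)]

f1 = difference_factor_f1(measured, predicted)
f2 = similarity_factor_f2(measured, predicted)
print(f"f1 = {f1:.2f}, f2 = {f2:.2f}")
```

By convention, two profiles are usually regarded as similar when f1 falls between 0 and 15 and f2 between 50 and 100. How the reported testing accuracy is derived from these factors is not stated in the description, so the sketch stops at computing f1 and f2.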
Created:
2025-06-05



