Machine Learning Based Quantitative Structure–Dissolution Profile Relationship

Indexed by NIAID Data Ecosystem on 2026-05-02

Download link:

https://figshare.com/articles/dataset/Machine_Learning_Based_Quantitative_Structure_Dissolution_Profile_Relationship/29244880


Description:

Accurately characterizing drug dissolution in the gastrointestinal tract is critical in drug discovery, because dissolution profiles provide essential information for estimating the bioavailability of orally administered drugs. While various methods have been developed to predict drug solubility from chemical structure, no reliable tools currently exist for predicting the dissolution rate constant. This study presents a novel two-stage machine learning approach, termed Machine Learning Based Quantitative Structure–Dissolution Profile Relationship, which integrates physics-informed neural networks (PINNs) and deep neural networks (DNNs) to predict drug dissolution profiles in water containing varying concentrations of the surfactant sodium lauryl sulfate. In the first stage, PINNs extract key dissolution parameters, namely the dissolution rate constant (k) and the dissolved mass fraction at saturation (ϕs), from existing dissolution data. By leveraging a physical law governing the dissolution process, PINNs aim to enhance prediction performance and reduce data requirements. Assuming first-order kinetics of the drug dissolution process as described by the Noyes–Whitney equation, PINNs with 8 hidden layers and 40 neurons per layer may outperform traditional nonlinear regression by effectively filtering noise and focusing on physically meaningful data. In the second stage, the extracted parameters (k and ϕs) are used to train a DNN that predicts dissolution profiles from the drug's chemical structure and the dissolution medium. Evaluated with the FDA-recommended difference and similarity factors (f1 and f2), the DNN, with 128 neurons in two hidden layers and a learning rate of 10^-2.8, achieved an average testing accuracy of 61.7% at an 80:20 train-to-test split. Although this accuracy is below the generally acceptable range of 70–80%, the approach shows significant potential as a low-cost, time-efficient tool for early-phase drug formulation. Future improvements are expected as data quality and diversity increase.
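
The description above names two quantitative pieces: a first-order dissolution model parameterized by k and ϕs, and the f1/f2 factors used to compare profiles. The following is a minimal Python sketch of both, not the authors' code. It assumes the first-order model takes the closed form ϕ(t) = ϕs · (1 − exp(−k·t)), which the abstract does not state explicitly, and it uses the standard FDA definitions of f1 and f2 on percent-dissolved values. The function names, time points, and parameter values are hypothetical and chosen only for illustration.

```python
import math


def first_order_profile(k, phi_s, times):
    """Dissolved mass fraction at each time, assuming
    phi(t) = phi_s * (1 - exp(-k * t)) (an assumed closed form)."""
    return [phi_s * (1.0 - math.exp(-k * t)) for t in times]


def f1_difference(reference, test):
    """Difference factor f1 (%), computed on percent-dissolved values."""
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    abs_diff = sum(abs(r - t) for r, t in zip(reference, test))
    return 100.0 * abs_diff / sum(reference)


def f2_similarity(reference, test):
    """Similarity factor f2, computed on percent-dissolved values."""
    if len(reference) != len(test) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    mean_sq = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq))


if __name__ == "__main__":
    # Hypothetical sampling times (minutes) and parameters, for illustration only.
    times = [5, 10, 15, 30, 45, 60]
    measured = [100.0 * p for p in first_order_profile(0.08, 0.95, times)]
    predicted = [100.0 * p for p in first_order_profile(0.07, 0.93, times)]
    print("f1 =", round(f1_difference(measured, predicted), 2))
    print("f2 =", round(f2_similarity(measured, predicted), 2))
```

Under the usual FDA convention, two profiles are regarded as similar when f1 lies between 0 and 15 and f2 lies between 50 and 100. How the study converts f1 and f2 into its reported "testing accuracy" is not specified in the description, so that step is left out here.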

Created:

2025-06-05
