Machine learning-driven bioavailability prediction in early-stage drug development: a KNIME-based computational workflow for digital health applications

NIAID Data Ecosystem2026-05-02 收录

下载链接：

https://figshare.com/articles/dataset/Machine_learning-driven_bioavailability_prediction_in_early-stage_drug_development_a_KNIME-based_computational_workflow_for_digital_health_applications/29167006

下载链接

链接失效反馈

官方服务：

资源简介：

Bioavailability prediction remains a significant challenge in early-stage drug development, where conventional experimental approaches are time-consuming and resource-intensive. This study explores the application of machine learning techniques to enhance the efficiency of bioavailability prediction. By leveraging computational workflows within the KNIME Analytics Platform, we aim to automate bioavailability assessment and reduce dependence on costly in vitro and in vivo studies. A dataset comprising 475 drug-like compounds characterised by key molecular descriptors was analysed using multiple machine learning models, including Random Forest, Gradient Boosting, Decision Trees, k-Nearest Neighbours, and neural networks. Model performance was assessed through 5-fold cross-validation, with ensemble models outperforming linear and neural network-based approaches. Random Forest demonstrated the highest predictive performance (R2 = 0.87, RMSE = 0.08). Feature importance analysis identified topological polar surface area and solubility as the most influential factors in bioavailability prediction. The findings underscore the potential of integrating open-source tools and machine learning methodologies in pharmaceutical research, improving workflow efficiency while adhering to FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. This approach facilitates rapid and cost-effective bioavailability assessment, supporting AI-driven predictive modelling and digital health applications in drug development.

创建时间：

2025-05-28

5,000+

优质数据集

54 个

任务类型

进入经典数据集