Using Stacking Ensemble Machine Learning to Estimate the Human Half-Life and Apparent Volume of Distribution: Implications for Human Health Risk Assessment
Source: Figshare | Updated: 2025-11-10 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/Using_Stacking_Ensemble_Machine_Learning_to_Estimate_the_Human_Half-Life_and_Apparent_Volume_of_Distribution_Implications_for_Human_Health_Risk_Assessment/30586620
Description:
Evaluating two population pharmacokinetic parameters, the biological half-life (HL) and the apparent volume of distribution (Vd), is important for identifying potential risks of chemicals. In this study, we developed a stacking machine learning framework for predicting both parameters, providing prediction methods that generalize better to data from diverse sources. We built a larger database containing experimental data for 2934 substances for HL and 1787 substances for Vd, and considered two different chemical featurization methods. We used five individual algorithms (Support Vector Regression, Random Forest, Gaussian Process, Artificial Neural Network, and Extreme Gradient Boosting) to construct the base models, and then combined their predictions using Multiple Linear Regression to obtain four stacking models. The stacking models performed well and outperformed the corresponding base models, with the extended connectivity fingerprint-based stacking model achieving the best predictive performance. Restricting predictions to the applicability domain further improved model accuracy while retaining more than 60% of the test data. Finally, we developed a publicly accessible website (http://tkpara.hhra.net) where users can easily and quickly apply our models. Our work provides data support for human health risk assessment of chemicals and for the use and management of chemicals and industrial products.
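The stacking step described above (base-model predictions combined by Multiple Linear Regression) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: two toy linear learners stand in for the five base algorithms, synthetic two-feature data stands in for chemical descriptors and HL/Vd values, the meta-model is trained on out-of-fold base predictions (a common stacking practice that the description does not confirm), and all function names are hypothetical.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_mlr(rows, targets):
    """Multiple linear regression with intercept via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict_mlr(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

def make_base_learner(feature_idx):
    """Hypothetical base learner: a linear fit on a subset of the features."""
    def fit(xs, ys):
        coefs = fit_mlr([[x[i] for i in feature_idx] for x in xs], ys)
        return lambda x: predict_mlr(coefs, [x[i] for i in feature_idx])
    return fit

def out_of_fold_predictions(fitters, xs, ys, n_folds=5):
    """Predict each sample with base models that never saw it in training."""
    meta = [[0.0] * len(fitters) for _ in xs]
    for fold in range(n_folds):
        train_idx = [i for i in range(len(xs)) if i % n_folds != fold]
        held_idx = [i for i in range(len(xs)) if i % n_folds == fold]
        for j, fit in enumerate(fitters):
            model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
            for i in held_idx:
                meta[i][j] = model(xs[i])
    return meta

def fit_stack(fitters, xs, ys):
    """Fit the MLR meta-model on out-of-fold predictions, then refit base models."""
    meta_coefs = fit_mlr(out_of_fold_predictions(fitters, xs, ys), ys)
    base_models = [fit(xs, ys) for fit in fitters]
    def predict(x):
        return predict_mlr(meta_coefs, [m(x) for m in base_models])
    return predict, base_models

def rmse(model, xs, ys):
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)) ** 0.5

# Synthetic data (assumption): the target depends on both features,
# while each toy base learner only sees one of them.
rng = random.Random(0)
xs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
ys = [2.0 * x[0] + 3.0 * x[1] + rng.gauss(0, 0.1) for x in xs]
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

fitters = [make_base_learner([0]), make_base_learner([1])]
stack, base_models = fit_stack(fitters, train_x, train_y)
stack_rmse = rmse(stack, test_x, test_y)
base_rmses = [rmse(m, test_x, test_y) for m in base_models]
print("base RMSEs:", base_rmses)
print("stack RMSE:", stack_rmse)
```

In this toy setting the stacked model has a lower test error than either base learner because each learner captures a different part of the signal. With real data, the gain depends on how complementary the base models are.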
Created:
2025-11-10



