
Assessment values of machine learning models.

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Assessment_values_of_machine_learning_models_/30014351
Description:
Aqueous solubility is an essential physical property of compounds, with applications across many fields. However, measuring solubility experimentally often requires substantial human and material resources. To address this issue, this study introduces the StackBoost model for predicting the solubility of organic compounds and systematically compares it with five well-known ensemble learning algorithms: Adaptive Boosting (AdaBoost), Gradient Boosted Regression Trees (GBRT), Light Gradient Boosting Machine (LGBM), Extreme Gradient Boosting (XGBoost), and Random Forest (RF). The prediction results indicate that the StackBoost model excels in predicting aqueous solubility, achieving a coefficient of determination (R²) of 0.90, a root mean square error (RMSE) of 0.29, and a mean absolute error (MAE) of 0.22, significantly outperforming the other models in the comparison. The study also conducted high-throughput screening on large-scale datasets and identified compounds with high potential for water solubility. Additionally, the model's generalization ability is verified through transfer learning. Although the performance of the StackBoost model decreases when applied to different datasets, it still shows considerable transferability, making it a more generalizable prediction model for aqueous solubility.
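The three assessment values named in the description (coefficient of determination, RMSE, and MAE) follow standard definitions. The snippet below is a minimal sketch of how they can be computed from paired observed and predicted values. It is not the authors' code: the function name `regression_metrics` and the sample numbers are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for paired observed and predicted values.

    Hypothetical helper for illustration; not taken from the dataset.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual sum of squares and total sum of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot                                  # coefficient of determination
    rmse = math.sqrt(ss_res / n)                                # root mean square error
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n   # mean absolute error
    return r2, rmse, mae


# Made-up values, used only to exercise the function.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

A higher coefficient of determination and lower RMSE and MAE indicate a better fit, which is the sense in which the description ranks StackBoost above the other five models.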
Created:
2025-08-29