
Modeling CO2 solubility in polyethylene glycol polymer using data driven methods

Source: Figshare. Updated: 2025-06-01. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Modeling_CO_sub_2_sub_solubility_in_polyethylene_glycol_polymer_using_data_driven_methods/29207146
Description:
The solubility of CO2 in polyethylene glycol (PEG) polymer is a critical parameter for optimizing its use in industrial processes, which motivates accurate predictive models. In this research, a Random Forest (RF) machine learning model is tuned with four optimization algorithms: Batch Bayesian Optimization (BBO), Self-Adaptive Differential Evolution (SADE), Bayesian Probability Improvement (BPI), and Gaussian Processes Optimization (GPO). The model uses a dataset of 164 experimental samples, with pressure, PEG molar mass, and temperature as input parameters, to predict CO2 solubility. To limit overfitting, K-fold cross-validation is applied throughout model training. Each optimization method is evaluated by computational runtime and by three performance metrics: R-squared (R2), mean squared error (MSE), and average absolute relative error (AARE%). Correlation analysis indicates that pressure has a moderate positive relationship with CO2 solubility (correlation coefficient: 0.58), while PEG molar mass and temperature exhibit weaker associations (0.2 and 0.05, respectively). Among the optimization techniques, RF-BPI is the most effective, achieving an R2 of 0.9625 on the training set and 0.9307 on the test set and surpassing the other methods. For benchmarking, a conventional multiple linear regression (LR) model was also tested and proved significantly less accurate (R2 notably lower than RF-BPI's 0.9307 on the test set). In terms of computational efficiency, BPI records the shortest runtime (63.9 seconds), while SADE is the least efficient, requiring 2370.2 seconds. Sensitivity analysis further clarifies the relative impact of each input variable on CO2 solubility, supporting the use of data-driven approaches for modeling complex systems.
The resulting models are promising tools for predicting CO2 solubility in PEG without requiring physical measurements, which are often labor-intensive and costly.
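The evaluation quantities named in the description (R2, MSE, AARE%, the correlation coefficient, and K-fold splitting) can be sketched in plain Python. This is only an illustrative sketch, not the authors' code: the description does not give formulas, so the AARE% definition used here (mean of |true - predicted| / |true|, times 100) and the Pearson form of the correlation coefficient are assumptions, and all function names and the fold count are hypothetical.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def aare_percent(y_true, y_pred):
    """Average absolute relative error in percent (assumed definition).

    Requires every true value to be non-zero.
    """
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    var_y = sum((b - y_bar) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train, test) index lists for shuffled K-fold cross-validation.

    Fold sizes differ by at most one sample.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


if __name__ == "__main__":
    # Made-up values for illustration only; they are NOT from the dataset.
    y_true = [1.0, 2.0, 4.0]
    y_pred = [1.0, 2.0, 3.0]
    print(r2_score(y_true, y_pred), mse(y_true, y_pred),
          aare_percent(y_true, y_pred))

    # 164 matches the sample count in the description; 5 folds is an assumption.
    for train, test in kfold_indices(164, 5):
        print(len(train), len(test))
```

In a cross-validated workflow, a model would be fitted on each `train` index set and scored on the matching `test` set with the three metric functions, then the scores averaged across folds.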
Created:
2025-06-01