Predicting Extraction Selectivity of Acetic Acid in Pervaporation by Machine Learning Models with Data Leakage Management
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/Predicting_Extraction_Selectivity_of_Acetic_Acid_in_Pervaporation_by_Machine_Learning_Models_with_Data_Leakage_Management/22344187
Description:
The extraction of acetic acid and other carboxylic acids from water is an emerging separation need, as these acids are increasingly produced from waste organics and CO2 during carbon valorization. However, the traditional experimental approach can be slow and expensive, and machine learning (ML) may provide new insights and guidance in membrane development for organic acid extraction. In this study, we collected extensive literature data and developed the first ML models for predicting separation factors between acetic acid and water in pervaporation from polymer properties, membrane morphology, fabrication parameters, and operating conditions. Importantly, we assessed seed randomness and data leakage problems during model development, which have been overlooked in ML studies but result in over-optimistic results and misinterpreted variable importance. With proper data leakage management, we established a robust model and achieved a root-mean-square error of 0.515 using the CatBoost regression model. The prediction model was also interpreted to elucidate variable importance: the mass ratio was the most significant variable in predicting separation factors, while polymer concentration and membrane effective area contributed to information leakage. These results demonstrate the value of ML models in membrane design and fabrication and the importance of rigorous model validation.
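
The two validation concerns named in the description, seed randomness and data leakage, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dataset or the authors' pipeline: the data are synthetic, scikit-learn's GradientBoostingRegressor stands in for CatBoost, and leakage is modeled as near-duplicate rows from the same hypothetical source landing in both the training and test folds. The description indicates that leakage in the study was tied to specific variables (polymer concentration and effective area), so the authors' actual procedure may well differ.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GroupShuffleSplit, KFold


def make_synthetic_data(n_groups=40, per_group=6, seed=0):
    """Synthetic stand-in data (hypothetical, not the published dataset).

    Each group mimics one source contributing several near-duplicate rows
    whose target is dominated by a group-specific offset.
    """
    rng = np.random.default_rng(seed)
    centers = rng.normal(size=(n_groups, 4))
    group_effect = rng.normal(scale=1.0, size=n_groups)
    groups = np.repeat(np.arange(n_groups), per_group)
    n_rows = n_groups * per_group
    X = centers[groups] + rng.normal(scale=0.05, size=(n_rows, 4))
    y = group_effect[groups] + rng.normal(scale=0.1, size=n_rows)
    return X, y, groups


def cv_rmse(X, y, splitter, groups=None, model_seed=0):
    """Mean RMSE over the folds produced by `splitter`."""
    errors = []
    for train_idx, test_idx in splitter.split(X, y, groups):
        # Stand-in model; the study itself reports a CatBoost regressor.
        model = GradientBoostingRegressor(random_state=model_seed)
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        errors.append(np.sqrt(mean_squared_error(y[test_idx], pred)))
    return float(np.mean(errors))


X, y, groups = make_synthetic_data()

random_scores, grouped_scores = [], []
for seed in range(3):
    # Row-wise shuffling: rows from one group can appear in both train and test.
    random_split = KFold(n_splits=5, shuffle=True, random_state=seed)
    random_scores.append(cv_rmse(X, y, random_split, model_seed=seed))

    # Group-aware splitting: every group is held out as a whole.
    grouped_split = GroupShuffleSplit(n_splits=5, test_size=0.2, random_state=seed)
    grouped_scores.append(cv_rmse(X, y, grouped_split, groups, model_seed=seed))

print("row-wise split RMSE per seed:   ", np.round(random_scores, 3))
print("group-aware split RMSE per seed:", np.round(grouped_scores, 3))
```

On this synthetic data the row-wise split should report a noticeably lower RMSE than the group-aware split, which is the over-optimism the description warns about. Repeating the evaluation over several seeds shows how much a single reported score depends on the random split.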
Created:
2023-03-27



