Prevention of Leakage in Machine Learning Prediction for Polymer Composite Properties
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Prevention_of_Leakage_in_Machine_Learning_Prediction_for_Polymer_Composite_Properties/25657025
Description:
Machine learning (ML) has facilitated property prediction for intricate
materials by integrating materials and experimental features such
as processing and measurement conditions. However, ML models designed
for material properties have often disregarded a common issue of “leakage,”
resulting in an overestimation of model performance and a decrease
in model transferability. This issue can arise from biases inherent
in multiple data points obtained from the same experimental group.
We critically examine leakage in property prediction for polymer
composites and propose a method to prevent it. The proposed method
partitions data by experimental group to ensure that data from the
same group never appear in both the training and test sets.
Evaluation results show that conventional random partitioning
unintentionally inflates ML performance: experimental features are
misused to leak data bias within the same experimental group rather
than to explain physical causality. In contrast,
the proposed method enables the leakage-free utilization of experimental
features to improve prediction accuracy while ensuring model transferability.
Specifically, when experimental features are integrated with polymer
and filler features to predict electrical conductivity, the conventional
method reports a leakage-dependent RMSE reduction of 26%, which
overestimates performance, whereas the proposed method achieves an
RMSE reduction of 5% without leakage. These findings offer valuable guidance for
the effective utilization of experimental features in data-driven
materials science.
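
The group-based partitioning described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes each data point carries an experimental-group label, and the function name `group_train_test_split` and the sample labels such as `"group_A"` are hypothetical. Whole groups are assigned to either the training or the test set, so no group contributes data to both.

```python
import random
from collections import defaultdict


def group_train_test_split(group_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no group appears in both sets.

    group_ids: one (hypothetical) experimental-group label per data point.
    Returns (train_idx, test_idx) as sorted lists of indices.
    """
    # Collect the indices of the data points belonging to each group.
    by_group = defaultdict(list)
    for idx, group in enumerate(group_ids):
        by_group[group].append(idx)

    # Shuffle the groups, not the individual data points.
    groups = sorted(by_group)
    random.Random(seed).shuffle(groups)

    # Hold out whole groups for testing (at least one group).
    n_test = max(1, round(len(groups) * test_fraction))
    test_idx = [i for g in groups[:n_test] for i in by_group[g]]
    train_idx = [i for g in groups[n_test:] for i in by_group[g]]
    return sorted(train_idx), sorted(test_idx)


# Illustrative labels only: eight measurements from four experimental groups.
group_ids = ["group_A", "group_A", "group_B", "group_B",
             "group_B", "group_C", "group_D", "group_D"]
train_idx, test_idx = group_train_test_split(group_ids, test_fraction=0.25)

train_groups = {group_ids[i] for i in train_idx}
test_groups = {group_ids[i] for i in test_idx}
print("train groups:", sorted(train_groups))
print("test groups:", sorted(test_groups))
```

A random split over individual data points would instead let measurements from one group land on both sides, which is the leakage the description warns about. In practice, scikit-learn's `GroupShuffleSplit` and `GroupKFold` provide the same kind of group-aware splitting.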
Created:
2024-04-20



