Leveraging Machine Learning for Thermoelectric Material Design: Addressing Composition–Property Relations and Data Imbalance Challenges
Source: Figshare. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Leveraging_Machine_Learning_for_Thermoelectric_Material_Design_Addressing_Composition_Property_Relations_and_Data_Imbalance_Challenges/30519047
Resource summary:
Thermoelectric (TE) technology has emerged as a promising and sustainable way to help meet growing global energy demand. Machine learning accelerates the discovery of high-performance thermoelectric materials, but its effectiveness is frequently hampered by data imbalance and data quality issues. This study addresses these challenges using a highly imbalanced data set of germanium telluride (GeTe) materials, including both pure GeTe and its doped or alloyed variants. A classification model was developed from four key descriptors (temperature, Seebeck coefficient, electronegativity, and electron affinity) to categorize samples into low, medium, and high figure of merit (ZT) classes. To mitigate the effects of class imbalance, an ensemble learning approach was combined with the Adaptive Synthetic Sampling (ADASYN) oversampling technique. Among the models evaluated, the XGBoost classifier performed best, achieving a macro-average precision of 0.94, recall of 0.95, F1-score of 0.94, and an overall accuracy of 94%, making it the most effective model for identifying high-performance TE materials under imbalanced conditions. The XGBoost regression model achieved an R² of 0.97 and an RMSE of 0.07, allowing effective screening of materials with high ZT values. To improve model interpretability, SHAP (SHapley Additive exPlanations) analysis was conducted; it revealed that temperature is the most significant factor in predicting the figure of merit. This work provides a solid and interpretable framework for accelerating the discovery of thermoelectric materials for next-generation energy conversion technologies.
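The summary reports macro-averaged precision, recall, and F1 alongside overall accuracy. A minimal sketch of why the distinction matters on imbalanced classes is given below. It is not the study's code: the function names `macro_scores` and `accuracy` and the toy labels are hypothetical, chosen only for illustration. Macro-averaging computes each metric per class and then takes the unweighted mean, so a rare class counts as much as a common one.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, f1).

    Each metric is computed per class, then averaged with equal weight
    per class, regardless of how many samples each class has.
    """
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Convention assumed here: an undefined ratio (0/0) is scored as 0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels (not from the data set): 8 low, 1 medium, 1 high ZT.
y_true = ["low"] * 8 + ["medium"] + ["high"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["low"] * 10

acc = accuracy(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_precision={macro_p:.2f} "
      f"macro_recall={macro_r:.2f} macro_f1={macro_f1:.2f}")
```

In this toy case the majority-class predictor scores high on accuracy but low on every macro-averaged metric, because it never identifies the medium or high classes. That is the reason macro-averaged scores are the more informative figures when the high-ZT class is rare.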



