Band Gap and Reorganization Energy Prediction of Conducting Polymers by the Integration of Machine Learning and Density Functional Theory
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://figshare.com/articles/dataset/Band_Gap_and_Reorganization_Energy_Prediction_of_Conducting_Polymers_by_the_Integration_of_Machine_Learning_and_Density_Functional_Theory/29163362
Description:
The performance and
reliability of machine learning (ML)-quantitative
structure–property relationship (QSPR) models depend on the
quality, size, and diversity of the data set used for model training.
In this study, we manually curated a large-scale data set containing
3120 donor–acceptor (D–A) conjugated polymers (CPs)
by selecting the 60 most utilized donors and the 52 most utilized acceptors. This data
set serves as a valuable resource for ML-based prediction of key electronic
properties such as band gap energy (E_g) and hole reorganization energy (λ_h), calculated
using density functional theory (DFT) to advance organic photovoltaics
(OPV). Beyond data set construction, we systematically investigated
how different descriptor and fingerprint types affect the performance
of the ML model. Recognizing that not all features contribute equally
to the model performance, we conducted an in-depth analysis to identify
the most informative descriptors for the fundamental optoelectronic
properties. Our findings show that kernel partial least-squares (KPLS)
regression utilizing radial and molprint2D fingerprints achieved the
highest accuracy in predicting E_g, with R² values of 0.899 and 0.897, respectively. For
λ_h prediction, models integrating electronic descriptors
such as frontier orbital energy levels significantly improved performance,
achieving an R² value of 0.830. This study
provides a comprehensive investigation of how different descriptors
influence model performance in OPV research. By analyzing why certain
models succeed while others fail, our findings offer insight into
feature selection and data set optimization for accurate target property
prediction in organic electronics. The developed ML models provide
a predictive framework for high-performance OPV materials design,
significantly reducing the reliance on labor-intensive experimental
procedures and computationally expensive first-principles calculations.
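
The data set size and the accuracy metric quoted above can be illustrated with a minimal Python sketch. It rests on two assumptions that the description does not state outright: that the 3120 polymers arise from pairing every one of the 60 donors with every one of the 52 acceptors (60 × 52 = 3120 is consistent with this), and that R² is the usual coefficient of determination. The donor and acceptor labels and the `r_squared` helper are hypothetical placeholders, not part of the published data set or workflow.

```python
from itertools import product

# Hypothetical placeholder labels; the real data set holds actual monomer structures.
donors = [f"D{i:02d}" for i in range(1, 61)]     # 60 donors
acceptors = [f"A{j:02d}" for j in range(1, 53)]  # 52 acceptors

# Assumption: every donor is combined with every acceptor.
pairs = list(product(donors, acceptors))
print(len(pairs))  # 3120


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

With made-up numbers, `r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])` gives 1.0 for a perfect prediction, while predicting the mean everywhere, `r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])`, gives 0.0. Values near 0.9, as reported for E_g, therefore mean the model accounts for most of the variance in the DFT-calculated target.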
Created:
2025-05-28



