Band Gap and Reorganization Energy Prediction of Conducting Polymers by the Integration of Machine Learning and Density Functional Theory

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Band_Gap_and_Reorganization_Energy_Prediction_of_Conducting_Polymers_by_the_Integration_of_Machine_Learning_and_Density_Functional_Theory/29163365
Description:
The performance and reliability of machine learning (ML)-quantitative structure–property relationship (QSPR) models depend on the quality, size, and diversity of the data set used for model training. In this study, we manually curated a large-scale data set containing 3120 donor–acceptor (D–A) conjugated polymers (CPs) by selecting the 60 most widely used donors and the 52 most widely used acceptors. This data set serves as a resource for ML-based prediction of key electronic properties, namely the band gap energy (Eg) and the hole reorganization energy (λh), both calculated using density functional theory (DFT), to advance organic photovoltaics (OPV). Beyond data set construction, we systematically investigated how different descriptor and fingerprint types affect ML model performance. Recognizing that not all features contribute equally to model performance, we conducted an in-depth analysis to identify the most informative descriptors for these fundamental optoelectronic properties. Our findings show that kernel partial least-squares (KPLS) regression using radial and molprint2D fingerprints achieved the highest accuracy in predicting Eg, with R² values of 0.899 and 0.897, respectively. For λh prediction, models integrating electronic descriptors such as frontier orbital energy levels significantly improved performance, achieving an R² value of 0.830. This study provides a comprehensive investigation of how different descriptors influence model performance in OPV research. By analyzing why certain models succeed while others fail, our findings offer insight into feature selection and data set optimization for accurate target property prediction in organic electronics. The developed ML models provide a predictive framework for the design of high-performance OPV materials, significantly reducing the reliance on labor-intensive experimental procedures and computationally expensive first-principles calculations.
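The data set size quoted above is consistent with pairing every donor with every acceptor (60 × 52 = 3120), and the reported accuracies are R² values. The following is a minimal Python sketch of both ideas, not the authors' code. The labels `D01`…`D60` and `A01`…`A52`, and the helper names `enumerate_pairs` and `r_squared`, are hypothetical placeholders; the description does not state that full pairwise enumeration is how the set was built, so that part is an assumption.

```python
from itertools import product


def enumerate_pairs(donors, acceptors):
    """Return every (donor, acceptor) combination as a list of tuples."""
    return list(product(donors, acceptors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical placeholder labels; the real data set uses actual donor and
# acceptor units, which are not listed in this description.
donors = [f"D{i:02d}" for i in range(1, 61)]      # 60 donors
acceptors = [f"A{j:02d}" for j in range(1, 53)]   # 52 acceptors

pairs = enumerate_pairs(donors, acceptors)
print(len(pairs))  # 3120
```

In use, `r_squared` would take the DFT-calculated Eg or λh values as `y_true` and the model predictions as `y_pred`; a value of 1.0 means perfect agreement, and 0.0 means the model does no better than predicting the mean.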
Created:
2025-05-28