Toward Accurate PAH IR Spectra Prediction: Handling Charge Effects with Classical and Deep Learning Models
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://figshare.com/articles/dataset/Toward_Accurate_PAH_IR_Spectra_Prediction_Handling_Charge_Effects_with_Classical_and_Deep_Learning_Models/28965459
Resource description:
Polycyclic aromatic hydrocarbons (PAHs) play a crucial role in
astrochemistry, environmental studies, and combustion chemistry, yet
interpreting their infrared (IR) spectra remains challenging because
many molecules share similar spectral features. The likely presence
of both neutral and charged PAHs in mixtures further complicates
spectral interpretation. While first-principles calculations provide
accurate spectral predictions, their high computational cost limits
scalability. This study employs machine learning (ML) to predict PAH
IR spectra, emphasizing models that apply simultaneously to neutral
and ionized molecules. Two models are introduced: an XGBoost model
trained on Morgan fingerprints and a graph neural network (GNN) that
employs molecular graph representations. Molecular charges are
handled by incorporating their one-hot or learnable neural network
(NN) encodings into the molecular representations. Both models
demonstrate excellent predictive capabilities, for the first time
enabling fast and accurate prediction of the IR spectra of charged
PAHs. While the XGBoost model demonstrates the highest accuracy
achieved to date, the GNN shows significant promise for future
advancements due to the inherent capabilities of molecular graph
representations. Remaining challenges, such as the scarcity of data
on heteroatomic PAHs, and potential approaches to addressing them are
also discussed in the manuscript.
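
The one-hot charge treatment mentioned in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the set of charge states, the function names, and the toy 8-bit vector standing in for a Morgan fingerprint are all assumptions made for illustration. The idea shown is only that a fixed-length charge indicator is appended to the structural feature vector, so that one model can be trained on neutral and ionized molecules together.

```python
from typing import List, Sequence

# Assumed charge states; the description does not say which charges the dataset covers.
CHARGE_STATES = (-1, 0, 1)


def one_hot_charge(charge: int, states: Sequence[int] = CHARGE_STATES) -> List[int]:
    """Return a one-hot vector marking which of the allowed states `charge` is."""
    if charge not in states:
        raise ValueError(f"unsupported charge {charge}; expected one of {tuple(states)}")
    return [1 if charge == s else 0 for s in states]


def add_charge_to_features(
    fingerprint_bits: Sequence[int],
    charge: int,
    states: Sequence[int] = CHARGE_STATES,
) -> List[int]:
    """Append the one-hot charge encoding to a fingerprint bit vector (hypothetical helper)."""
    return list(fingerprint_bits) + one_hot_charge(charge, states)


# Toy 8-bit placeholder for a Morgan fingerprint (real fingerprints are much longer).
toy_fingerprint = [0, 1, 1, 0, 0, 1, 0, 0]

neutral_features = add_charge_to_features(toy_fingerprint, 0)
cation_features = add_charge_to_features(toy_fingerprint, 1)
```

With this layout the neutral molecule and its cation share the same structural bits and differ only in the appended charge positions, which is what would let a tree-based model such as XGBoost split on charge. A learnable encoding would replace the fixed one-hot vector with a trained embedding.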
Created:
2025-05-08



