Developing a Hybrid Molecular Representation Combining Chemical Structure and MIR Spectral Data: A LogP Prediction Case Study
Source: Figshare. Updated 2025-11-05. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Developing_a_Hybrid_Molecular_Representation_Combining_Chemical_Structure_and_MIR_Spectral_Data_A_LogP_Prediction_Case_Study/30564096
Description:
This work presents a novel hybrid molecular fingerprint that integrates chemical structure and mid-infrared (MIR) spectral data into a compact 101-bit binary descriptor. Each bit is set only when a molecular substructure is present and a corresponding absorption band appears within a defined MIR region. The representation was evaluated on a logP prediction task using a data set of 1278 compounds. Support vector regression (SVR) with the hybrid fingerprint achieved an RMSE of 1.443. For comparison, traditional structure-based representations gave lower (better) RMSEs: 1.056 for Morgan fingerprints (1024 bits), 0.995 for MACCS keys (166 bits), and 0.802 for descriptor-based models. Commercial and open-source logP prediction tools also performed better, with RMSEs of 1.090 (SLogP), 1.098 (cLogP), 1.129 (QPLogPo/w), and 1.156 (XLogP3). Despite its modest predictive accuracy, the proposed fingerprint is interpretable and computationally efficient, and it links experimental spectral evidence to cheminformatics modeling. The study demonstrates that MIR data can feasibly be incorporated into QSAR workflows and lays the foundation for further development of spectrum-informed molecular representations.
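The bit definition described above (substructure present and a matching band in a defined MIR region) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `hybrid_fingerprint` and `rmse`, the three example regions, and their wavenumber ranges are assumptions chosen for illustration only, and the actual 101 bit definitions are not given in this record.

```python
from math import sqrt

# Hypothetical (assumed) MIR regions in cm^-1, one per fingerprint bit.
# The real descriptor has 101 bits; these three are illustrative only.
EXAMPLE_REGIONS = [
    (1650.0, 1800.0),  # assumed region for a carbonyl C=O stretch
    (3200.0, 3600.0),  # assumed region for an O-H stretch
    (2200.0, 2260.0),  # assumed region for a nitrile stretch
]


def hybrid_fingerprint(substructure_flags, peak_wavenumbers, regions):
    """Return a list of 0/1 bits.

    Bit i is 1 only if substructure i is present AND at least one
    observed absorption peak falls inside regions[i].
    """
    if len(substructure_flags) != len(regions):
        raise ValueError("one substructure flag is required per region")
    bits = []
    for present, (low, high) in zip(substructure_flags, regions):
        has_band = any(low <= peak <= high for peak in peak_wavenumbers)
        bits.append(1 if (present and has_band) else 0)
    return bits


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


# Example: carbonyl and hydroxyl substructures are flagged as present,
# but only a carbonyl-region peak (1715 cm^-1) is observed.
example_bits = hybrid_fingerprint(
    [True, True, False], [1715.0, 2950.0], EXAMPLE_REGIONS
)
```

In this sketch the hydroxyl bit stays 0 even though the substructure is flagged, because no peak falls in its assumed region. That conjunction is what ties each bit to spectral evidence. The `rmse` helper mirrors the metric reported in the description; the regression model itself (SVR) is not reproduced here.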
Created:
2025-11-05



