Developing a Hybrid Molecular Representation Combining Chemical Structure and MIR Spectral Data: A LogP Prediction Case Study
Source: NIAID Data Ecosystem. Indexed: 2026-05-10
Download link:
https://figshare.com/articles/dataset/Developing_a_Hybrid_Molecular_Representation_Combining_Chemical_Structure_and_MIR_Spectral_Data_A_LogP_Prediction_Case_Study/30564096
Resource description:
This work presents a hybrid molecular fingerprint that integrates chemical
structure and mid-infrared (MIR) spectral data into a compact 101-bit binary
descriptor. Each bit reflects both the presence of a
molecular substructure and a corresponding absorption band within
defined MIR regions. This representation was evaluated in a logP prediction
task across a data set of 1278 compounds. Support vector regression
(SVR) using the hybrid fingerprint achieved an RMSE of 1.443. For
comparison, traditional structure-based representations yielded lower
(better) RMSEs: 1.056 for Morgan fingerprints (1024 bits), 0.995 for MACCS
keys (166 bits), and 0.802 for descriptor-based models. Commercial and open-source logP
prediction tools also performed better, with RMSEs of 1.090 (SLogP),
1.098 (cLogP), 1.129 (QPLogPo/w), and 1.156 (XLogP3). Despite its
modest predictive accuracy, the proposed fingerprint offers a uniquely
interpretable and computationally efficient approach, bridging experimental
spectral evidence with cheminformatics modeling. This study demonstrates
the feasibility of incorporating MIR data into QSAR workflows and
lays the foundation for further development of spectrum-informed molecular
representations.
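
The description above says each bit reflects both a molecular substructure and a matching absorption band in a defined MIR region. A minimal sketch of that idea follows, assuming a bit is set only when both conditions hold (a logical AND, which the description implies but does not state). The `BIT_DEFINITIONS` table, the substructure labels, and the wavenumber windows are hypothetical illustrations using typical textbook ranges; they are not the 101 definitions used in the study. Substructure detection and peak picking are assumed to happen upstream.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical (substructure label, MIR window in cm^-1) pairs.
# Illustrative ranges only; NOT the 101 bit definitions from the study.
BIT_DEFINITIONS: List[Tuple[str, Tuple[float, float]]] = [
    ("carbonyl", (1650.0, 1800.0)),
    ("hydroxyl", (3200.0, 3600.0)),
    ("nitrile", (2200.0, 2260.0)),
]


def hybrid_fingerprint(substructures: Set[str],
                       peaks_cm1: Iterable[float]) -> List[int]:
    """Return one bit per definition.

    Assumption: a bit is 1 only if the substructure is present in the
    molecule AND at least one observed peak falls inside its MIR window.
    """
    peaks = list(peaks_cm1)
    bits = []
    for label, (low, high) in BIT_DEFINITIONS:
        has_substructure = label in substructures
        has_band = any(low <= peak <= high for peak in peaks)
        bits.append(int(has_substructure and has_band))
    return bits


# A molecule annotated with carbonyl and hydroxyl groups, whose spectrum
# shows peaks at 1715 and 2950 cm^-1 (no peak in the hydroxyl window).
example_bits = hybrid_fingerprint({"carbonyl", "hydroxyl"}, [1715.0, 2950.0])
print(example_bits)  # [1, 0, 0]
```

In this sketch the hydroxyl bit stays 0 because no peak supports it, which illustrates how such a descriptor could tie each bit to experimental spectral evidence rather than to structure alone.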
Created:
2025-11-05



