Developing a Hybrid Molecular Representation Combining Chemical Structure and MIR Spectral Data: A LogP Prediction Case Study

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Developing_a_Hybrid_Molecular_Representation_Combining_Chemical_Structure_and_MIR_Spectral_Data_A_LogP_Prediction_Case_Study/30564096
Description:
A novel hybrid molecular fingerprint is presented that integrates chemical structure and mid-infrared (MIR) spectral data into a compact 101-bit binary descriptor. Each bit reflects both the presence of a molecular substructure and a corresponding absorption band within a defined MIR region. The representation was evaluated in a logP prediction task on a data set of 1278 compounds. Support vector regression (SVR) using the hybrid fingerprint achieved an RMSE of 1.443. For comparison, traditional structure-based representations yielded lower RMSEs: 1.056 for Morgan fingerprints (1024 bits), 0.995 for MACCS keys (166 bits), and 0.802 for descriptor-based models. Commercial and open-source logP prediction tools also performed better, with RMSEs of 1.090 (SLogP), 1.098 (cLogP), 1.129 (QPLogPo/w), and 1.156 (XLogP3). Despite its modest predictive accuracy, the proposed fingerprint offers an interpretable and computationally efficient approach that links experimental spectral evidence with cheminformatics modeling. The study demonstrates the feasibility of incorporating MIR data into QSAR workflows and lays the foundation for further development of spectrum-informed molecular representations.
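The bit rule described above (a bit is set only when a substructure and an absorption band in its MIR region are both present) can be illustrated with a minimal Python sketch. The three bit definitions, their wavenumber ranges, and the function names `hybrid_fingerprint` and `rmse` are hypothetical and chosen for illustration; the description does not list the actual 101 bit definitions or how substructures and bands were detected, so this is not the authors' implementation.

```python
from math import sqrt

# Hypothetical bit definitions: (substructure label, MIR region in cm^-1).
# Illustrative only; the real fingerprint has 101 bits whose definitions
# are not given in this description.
BIT_DEFINITIONS = [
    ("carbonyl", (1650.0, 1800.0)),
    ("hydroxyl", (3200.0, 3600.0)),
    ("nitrile", (2200.0, 2260.0)),
]


def hybrid_fingerprint(substructures, band_positions, definitions=BIT_DEFINITIONS):
    """Return a list of 0/1 bits, one per definition.

    A bit is 1 only if the substructure label is present AND at least one
    observed absorption band falls inside the paired MIR region.
    """
    bits = []
    for label, (low, high) in definitions:
        has_substructure = label in substructures
        has_band = any(low <= position <= high for position in band_positions)
        bits.append(1 if has_substructure and has_band else 0)
    return bits


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Example: carbonyl and hydroxyl are present, but only the carbonyl region
# contains an observed band, so only the first bit is set.
example_bits = hybrid_fingerprint({"carbonyl", "hydroxyl"}, [1715.0, 2950.0])
```

In this sketch the substructure set and band positions are supplied directly; in practice they would come from a structure-matching step and a peak-picking step, neither of which is specified here. The `rmse` helper corresponds to the error metric used to compare the models above.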
Created:
2025-11-05