
Cocrystal Formation Prediction: Hybrid GIN-Mordred Model Outperforms DFT-Based Methods

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Cocrystal_Formation_Prediction_Hybrid_GIN-Mordred_Model_Outperforms_DFT-Based_Methods/28685120
Description:
Cocrystals offer significant potential across various industries, especially pharmaceuticals, where they can address the poor solubility of new drug candidates. However, traditional experimental screening for cocrystal formation is expensive and time-consuming, which motivates the development of predictive models. In this study, we compared four cocrystal prediction approaches: two deep learning (DL) models based on DFT-derived data (PointNet for electrostatic potential (ESP) maps and a novel LSTM for sequential hydrogen bond parameters), a novel hybrid model combining graph isomorphism networks (GIN) with Mordred descriptors, and the empirical Hydrogen Bond Energy (HBE) method. To perform this comparison, we compiled a dataset of 14,790 molecules (7,395 pairs of successful and unsuccessful cocrystals) and carried out DFT calculations for them. The GIN-Mordred model outperformed all other methods, achieving the highest balanced accuracy (BACC: 0.916), F1-score (0.956), recall (0.932), and AUC (0.97), and it separated the two cocrystallization outcomes more cleanly than the other approaches. Importantly, the GIN-Mordred model does not require costly DFT calculations, demonstrating that a combination of graph-based and descriptor-based molecular representations provides an efficient and accurate alternative for cocrystal prediction. This model significantly streamlines the process of tuning the physicochemical properties of crystalline materials for various applications.
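For readers unfamiliar with the metrics quoted in the description (BACC, F1-score, recall, AUC), the following is a minimal sketch of their standard definitions, assuming binary labels where 1 means a cocrystal forms and 0 means it does not. The function names and the toy counts are illustrative assumptions, not taken from the study or its code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: cocrystal-forming pairs predicted correctly/incorrectly.
    tn/fp: non-forming pairs predicted correctly/incorrectly.
    """
    recall = tp / (tp + fn)          # true positive rate (sensitivity)
    specificity = tn / (tn + fp)     # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    bacc = (recall + specificity) / 2  # balanced accuracy
    return {"recall": recall, "specificity": specificity,
            "precision": precision, "f1": f1, "bacc": bacc}


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(P * N) and is meant
    for illustration only, not for large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the study).
    print(classification_metrics(tp=90, fp=10, tn=40, fn=10))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Balanced accuracy averages recall and specificity, so it is less sensitive to class imbalance than plain accuracy. This helps explain how a model can report an F1-score (0.956) above its BACC (0.916) when positive examples are more common in the evaluation set.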
Created:
2025-03-28