Neural Mulliken Analysis: Molecular Graphs from Density Matrices for QSPR on Raw Quantum-Chemical Data
Source: NIAID Data Ecosystem. Indexed: 2026-05-02
Download link:
https://figshare.com/articles/dataset/Neural_Mulliken_Analysis_Molecular_Graphs_from_Density_Matrices_for_QSPR_on_Raw_Quantum-Chemical_Data/29423149
Description:
Here, molecular graphs derived from the one-electron density matrix
are introduced within a more general effort to explore whether incorporating
electronic structure awareness allows a single model to both better
generalize from small data and better learn molecular encodings. Diagonal
density matrix blocks serve as atomic node embeddings, while off-diagonal
blocks provide embeddings for “link” nodes related to atomic pairs. In a minimal basis, these embeddings
have dimensions of only 45 and 81, yet no information is lost and
the original density matrix can be fully reconstructed. Blocks from
the basis set overlap matrix are used as edge embeddings to encode
structural information and as weights for message aggregation operations.
Additionally, element-wise multiplication performed during aggregation
may provide access to electronic charges, analogous to Mulliken population
analysis. The proposed concept was evaluated using data from the First
and Second Solubility Challenges (Llinàs et al. J. Chem. Inf. Model.
2008, 48, 1289–1303; Llinàs and Avdeef J. Chem. Inf. Model.
2019, 59, 3036–3040). A graph neural
network (GNN) trained on sets of 94 and 1000 drug-like molecules achieved
improved solubility prediction accuracy (RMSE 0.63, R² 0.79 on SC-1; RMSE 0.83 and 0.92, R² 0.57 and 0.79 on the “tight” and “loose”
SC-2 test sets, respectively). If combined with existing techniques
for predicting electron density from molecular structures, this approach
is promising for addressing a range of chemical machine-learning problems.
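
The block encoding described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`encode`, `decode`, `mulliken_populations`) are hypothetical, the matrices are random symmetric stand-ins rather than real quantum-chemical data, and the choice of 9 basis functions per atom is an assumption made only because it is consistent with the stated dimensions (9·10/2 = 45 for a symmetric diagonal block, 9·9 = 81 for an off-diagonal block). The sketch shows that the packed embeddings reconstruct the matrix exactly, and that summing the element-wise product of density and overlap entries over an atom's rows gives Mulliken-style gross populations.

```python
import random

N_BF = 9  # ASSUMPTION: basis functions per atom (9*10/2 = 45, 9*9 = 81)


def block(M, a, b):
    """Return the N_BF x N_BF block of M coupling atoms a and b."""
    return [row[b * N_BF:(b + 1) * N_BF] for row in M[a * N_BF:(a + 1) * N_BF]]


def encode(P, n_atoms):
    """Split a symmetric matrix into atom-node and link-node embeddings."""
    nodes, links = {}, {}
    for a in range(n_atoms):
        D = block(P, a, a)
        # A diagonal block is symmetric, so its upper triangle (45 values) suffices.
        nodes[a] = [D[i][j] for i in range(N_BF) for j in range(i, N_BF)]
        for b in range(a + 1, n_atoms):
            # An off-diagonal block has no symmetry of its own: keep all 81 values.
            links[(a, b)] = [x for row in block(P, a, b) for x in row]
    return nodes, links


def decode(nodes, links, n_atoms):
    """Rebuild the full symmetric matrix from the output of encode()."""
    n = n_atoms * N_BF
    P = [[0.0] * n for _ in range(n)]
    for a, v in nodes.items():
        values = iter(v)
        for i in range(N_BF):
            for j in range(i, N_BF):
                x = next(values)
                P[a * N_BF + i][a * N_BF + j] = x
                P[a * N_BF + j][a * N_BF + i] = x
    for (a, b), v in links.items():
        for i in range(N_BF):
            for j in range(N_BF):
                x = v[i * N_BF + j]
                P[a * N_BF + i][b * N_BF + j] = x
                P[b * N_BF + j][a * N_BF + i] = x  # transposed block
    return P


def mulliken_populations(P, S, n_atoms):
    """Gross population per atom: sum of P[mu][nu] * S[mu][nu] over the
    atom's rows mu and all columns nu (S is symmetric)."""
    n = n_atoms * N_BF
    return [
        sum(P[mu][nu] * S[mu][nu]
            for mu in range(a * N_BF, (a + 1) * N_BF)
            for nu in range(n))
        for a in range(n_atoms)
    ]


def random_symmetric(n, rng):
    """Synthetic symmetric matrix, used only as placeholder input."""
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            M[i][j] = M[j][i] = rng.uniform(-1.0, 1.0)
    return M


rng = random.Random(0)
n_atoms = 3
P = random_symmetric(n_atoms * N_BF, rng)  # stand-in for a density matrix
S = random_symmetric(n_atoms * N_BF, rng)  # stand-in for an overlap matrix
nodes, links = encode(P, n_atoms)
P_rec = decode(nodes, links, n_atoms)
pops = mulliken_populations(P, S, n_atoms)
```

In this sketch the per-atom populations sum to the trace of the product of the two matrices, which for a real density and overlap matrix would be the total electron count. How a real minimal basis with fewer than 9 functions on light atoms is padded to a fixed block size is not specified in the description and is left open here.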
Created:
2025-06-27



