Basic Stability Tests of Machine Learning Potentials for Molecular Simulations in Computational Drug Discovery
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Basic_Stability_Tests_of_Machine_Learning_Potentials_for_Molecular_Simulations_in_Computational_Drug_Discovery/29997930
Description:
Neural network potentials trained on quantum-mechanical data can
calculate molecular interactions with relatively high speed and accuracy.
However, not all neural network potentials are suitable for molecular
simulations, as they might exhibit instabilities, nonphysical behavior,
or lack accuracy. To assess the reliability of neural network potentials,
a series of tests is conducted during model training, in the gas phase,
and in the condensed phase. The testing procedure is performed for
eight in-house neural network potentials based on the ANI-2x data
set, using both the ANI-2x and MACE architectures. This consistent
framework allows an evaluation of the effect of the model architecture
on its performance. For comparison, we also perform stability tests
of the publicly available neural network potentials: ANI-2x, ANI-1ccx,
MACE-OFF23, and AIMNet2. The results show that the different models
have different weaknesses. A normal-mode analysis of 14 simple benchmark
molecules with large displacements from the energy minima revealed
that the published MACE-OFF23-S model shows large deviations from
the reference quantum-mechanical energy surface. Also, some MACE models
with a reduced number of parameters failed to produce stable molecular
dynamics simulations in the gas phase, and all MACE models exhibit
unfavorable behavior during steric clashes. In addition, the published
ANI-2x and one of the in-house MACE models are not able to reproduce
the structure of liquid water at ambient conditions, forming an amorphous
solid phase instead. For the ANI-1ccx model, the multibody interactions
in the condensed water phase lead to nonphysical additional energy
minima in bond length and bond angle space, which caused a phase transition
to an amorphous solid. Out of all 13 considered public and in-house
models, only one in-house model based on the ANI-2x B97-3c
data set shows better agreement with the experimental radial distribution
function of water than the simple molecular mechanics TIP3P and OPC
models. Protein–ligand interaction energies for the four benchmark
systems TYK2, CDK2, JNK1, and P38 show that almost all models exhibit
a higher correlation with experimental binding affinities than the
Chemgauss4 docking score (average R² > 0.16). With an average R² of 0.43, the
ANI-2x model outperforms molecular mechanics calculations with the
GAFF2 force field and DFTB3 semiempirical calculations (average R² of
0.39 and 0.38, respectively), approaching the accuracy of absolute
binding free energy calculations (average R² of 0.52). However, the rather mixed results for the
different machine learning potentials show that great care must be
taken during model training and when selecting a neural network potential
for real-world applications.
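
The description compares models by an average R² over the four benchmark systems. As a minimal sketch of how such a figure could be computed, the following assumes that R² means the squared Pearson correlation between computed interaction energies and experimental binding affinities, and that the average is a plain mean over systems; the description states neither. The function names `r_squared` and `average_r_squared` are hypothetical, and any numbers passed in are placeholders, not data from the study.

```python
from math import fsum


def r_squared(predicted, experimental):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is the R² definition; the source does not say.
    """
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = fsum(predicted) / n
    mean_y = fsum(experimental) / n
    # Sums of centered cross-products and squares.
    sxy = fsum((x - mean_x) * (y - mean_y)
               for x, y in zip(predicted, experimental))
    sxx = fsum((x - mean_x) ** 2 for x in predicted)
    syy = fsum((y - mean_y) ** 2 for y in experimental)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


def average_r_squared(per_system):
    """Plain mean of R² over a dict: system name -> (predicted, experimental)."""
    values = [r_squared(pred, exp) for pred, exp in per_system.values()]
    return fsum(values) / len(values)
```

With per-system data stored under keys such as `"TYK2"` or `"CDK2"`, `average_r_squared` returns one number per model that can be compared across models. Because the correlation is squared, a strongly anti-correlated model scores as high as a strongly correlated one, so the sign of the correlation should be checked separately.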
Created:
2025-08-27



