Conformer Generation with OMEGA: Learning from the Data Set and the Analysis of Failures
Source: NIAID Data Ecosystem. Indexed 2026-03-07.
Download link:
https://figshare.com/articles/dataset/Conformer_Generation_with_OMEGA_Learning_from_the_Data_Set_and_the_Analysis_of_Failures/2465935
Description:
We recently published a high-quality validation set for testing
conformer generators, consisting of structures from both the PDB and
the CSD (Hawkins, P. C. D. et al. J. Chem. Inf. Model. 2010, 50, 572.), and tested the performance
of our conformer generator, OMEGA, on these sets. In the present publication,
we focus on understanding the suitability of those data sets for validation
and identifying and learning from OMEGA’s failures. We compare,
for the first time to our knowledge, the coverage of the applicable
property spaces between the validation data sets we used and the parent
compound sets to determine if our data sets adequately sample these
property spaces. We also introduce the concept of torsion fingerprinting
and compare this method of dissimilation to the more traditional graph-centric
diversification methods we used in our previous publication. To improve
our ability to programmatically identify cases where the crystallographic
conformation is not well reproduced computationally, we introduce
a new metric to compare conformations, RMSTanimoto. This new metric
is used alongside those from our previous publication to efficiently
identify reproduction failures. We find RMSTanimoto to be particularly
effective in identifying failures for the smallest molecules in our
data sets. Analysis of the nature of these failures, particularly
those for the CSD, sheds further light on the issue of strain in crystallographic
structures. Some of the residual failure cases not resolved by simple
changes in OMEGA’s defaults present significant challenges
to conformer generation engines like OMEGA and are a source of new
avenues to further improve their performance, while others illustrate
the pitfalls of validating against crystallographic ligand conformations,
particularly those from the PDB.
Created:
2016-02-20



