One Size Fits All? Development of the CPOSS209 Data Set of Experimental and Hypothetical Polymorphs for Testing Computational Modeling Methods
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/One_Size_Fits_All_Development_of_the_CPOSS209_Data_Set_of_Experimental_and_Hypothetical_Polymorphs_for_Testing_Computational_Modeling_Methods/28883045
Description:
Organic crystal structure prediction (CSP) studies have led to
the rapid development of methods for predicting the relative energies
of known and computer-generated crystal structures. There is a compromise
between the level of theoretical treatment, its reliability across
different types of organic systems, how its accuracy depends on the
size and shape of the unit cell, and the size and the number of structures
that can be modeled at an affordable computational cost. We have used
our database of crystal structure prediction studies, often performed
as a complement to experimental screening, to produce sets comprising
6 to 15 crystal structures, covering known polymorphs, observed packings
of closely related molecules, and CSP-generated energetically competitive
but distinct structures, for 20 organic molecules. These have been
chosen to illustrate some of the issues that need consideration in
any lattice energy method that seeks to be generally applicable to moderate-sized
organic molecules, including small drug molecules. We included the
methods of crystallization reported for the experimental polymorphs.
In all of the examples, the original CSP used electronic structure
calculations on the molecule to give the conformational energy and
an anisotropic atom–atom model for the electrostatic intermolecular
energy, combined with an empirical “exp-6” repulsion-dispersion
model to give the intermolecular lattice energy. The lattice
energies and structures are compared with those obtained by reoptimizing
with periodic, plane-wave, dispersion-corrected density functional
theory, specifically PBE with the TS dispersion correction, and with
single-point energies where the many-body dispersion (MBD)
correction is applied, as an example of a widely used “workhorse”
method. The use of this data set for a preliminary test of modeling
methods is illustrated for two machine-learned foundation models,
MACE-MP-0 and MACE-OFF23. The challenges in modeling the putative
and observed polymorphs for a range of molecules, their energies,
and the possible level of agreement with experimental data are illustrated.
Very similar molecules can differ significantly in the polymorphs
observed, only partially reflecting the range of polymorph screening
experiments used and the energetically competitive structures produced
by CSP approaches based on a purely thermodynamic paradigm.
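
As a rough illustration of two ideas mentioned in the description, the sketch below shows the standard Buckingham "exp-6" pair form (exponential repulsion minus an r^-6 dispersion term) and how a set of lattice energies can be expressed relative to the most stable structure. It is a minimal sketch, not the data set's own code: the function names, the parameters a, b, c, and the structure labels are hypothetical, and no values from CPOSS209 are used.

```python
import math


def exp6_pair_energy(r, a, b, c):
    """Buckingham exp-6 pair energy: a*exp(-b*r) - c/r**6.

    r is the atom-atom separation; a, b, c are placeholder parameters
    (hypothetical here, normally fitted empirically per atom-type pair).
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    return a * math.exp(-b * r) - c / r**6


def relative_energies(lattice_energies):
    """Return each energy relative to the lowest (most stable) one.

    lattice_energies maps a structure label to its lattice energy,
    all in the same units and on the same per-molecule basis.
    """
    e_min = min(lattice_energies.values())
    return {name: e - e_min for name, e in lattice_energies.items()}
```

Comparing methods on relative rather than absolute energies is the natural choice here, because polymorph ranking depends only on energy differences between structures of the same molecule.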
Created:
2025-04-28



