Data-Driven Approaches Can Overcome the Cost–Accuracy Trade-Off in Multireference Diagnostics
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Data-Driven_Approaches_Can_Overcome_the_Cost_Accuracy_Trade-Off_in_Multireference_Diagnostics/12594202
Description:
High-throughput computational screening typically employs methods
(i.e., density functional theory or DFT) that can fail to describe
challenging molecules, such as those with strongly correlated electronic
structure. In such cases, multireference (MR) correlated wavefunction
theory (WFT) would be the appropriate choice but remains more challenging
to carry out and automate than single-reference (SR) WFT or DFT. Numerous
diagnostics have been proposed for identifying when MR character is
likely to have an effect on the predictive power of SR calculations,
but conflicting conclusions about diagnostic performance have been
reached on small data sets. We compute 15 MR diagnostics, ranging
from affordable DFT-based to more costly MR-WFT-based diagnostics,
on a set of 3165 equilibrium and distorted small organic molecules
containing up to six heavy atoms. Conflicting MR character assignments
and low pairwise linear correlations among diagnostics are also observed
over this set. We evaluate the ability of existing diagnostics to
predict the percent recovery of the correlation energy, %Ecorr. None of the DFT-based diagnostics are nearly as
predictive of %Ecorr as the best WFT-based
diagnostics. To overcome the limitation of this cost–accuracy
trade-off, we develop machine learning (ML, i.e., kernel ridge regression)
models to predict WFT-based diagnostics from a combination of DFT-based
diagnostics and a new, size-independent 3D geometric representation.
The ML-predicted diagnostics correlate as well with MR effects as
their computed (i.e., with WFT) values, significantly improving over
the DFT-based diagnostics on which the models were trained. These
ML models thus provide a promising approach to improve upon DFT-based
diagnostic accuracy while remaining suitably low cost for high-throughput
screening.
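
As a rough illustration of the kernel ridge regression idea mentioned above, the sketch below fits a model that maps cheap input features to a more expensive target quantity. It is not the authors' model: the data are synthetic, the Gaussian kernel and its hyperparameters are assumptions, and every name (`rbf_kernel`, `krr_fit`, `krr_predict`, `toy_target`) is hypothetical. In the setting described here, the inputs would be DFT-based diagnostics plus a geometric representation and the target would be a WFT-based diagnostic.

```python
import math
import random


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def krr_fit(X, y, gamma, lam):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(K, y)


def krr_predict(X_train, alpha, x_new, gamma):
    """Predict as a kernel-weighted sum over the training points."""
    return sum(a * rbf_kernel(xt, x_new, gamma) for a, xt in zip(alpha, X_train))


def toy_target(x):
    """Synthetic stand-in for an expensive target; not a real diagnostic."""
    return math.sin(3 * x[0]) + 0.5 * x[1]


# Synthetic stand-in data: two "cheap" features per sample.
rng = random.Random(0)
X_train = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(40)]
y_train = [toy_target(x) for x in X_train]

GAMMA, LAM = 2.0, 1e-4  # assumed hyperparameters, not tuned
alpha = krr_fit(X_train, y_train, GAMMA, LAM)

# Evaluate on held-out synthetic points inside the training range.
X_test = [[rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)] for _ in range(20)]
errors = [abs(krr_predict(X_train, alpha, x, GAMMA) - toy_target(x)) for x in X_test]
mae = sum(errors) / len(errors)
print(f"held-out mean absolute error: {mae:.4f}")
```

In practice a library implementation with cross-validated hyperparameters would replace the hand-written solver; the plain-Python version is used here only to keep the sketch self-contained.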
Created:
2020-06-14



