Machine Learning C–N Couplings: Obstacles for a General-Purpose Reaction Yield Prediction
Source: NIAID Data Ecosystem. Indexed: 2026-03-14
Download link:
https://figshare.com/articles/dataset/Machine_Learning_C_N_Couplings_Obstacles_for_a_General-Purpose_Reaction_Yield_Prediction/21865002
Description:
Pd-catalyzed C–N couplings are commonplace in academia and industry.
Despite their significance, finding reaction conditions that lead to,
for instance, a high yield remains a challenging and time-consuming
task that usually requires screening many sets of conditions.
Machine learning is an emerging and promising technique for selecting
reaction conditions in the vast space of reagent combinations.
In this work, we assess whether the reaction yield
of C–N couplings can be predicted from databases of chemical
reactions. We test the generalizability of models both on challenging
data splits and on a dedicated experimental test set. We find that
the models perform well as long as they stay within the chemical space
represented by the training set. However, the applicability domain is
quickly left even for simple reactions of the same type, such as those
in our plate test set. The results show that yield prediction
for new reactions is possible from the algorithmic side but in practice
is hindered by the available data. Most importantly, more data that
cover the diversity in reagents are needed for a general-purpose prediction
of reaction yields. Our findings also expose a challenge for this field:
judging models based on literature data with test sets split off from
the same literature data appears to be highly misleading, even when
challenging splits are considered.
Created:
2023-01-11



