SCScore: Synthetic Complexity Learned from a Reaction Corpus
Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/SCScore_Synthetic_Complexity_Learned_from_a_Reaction_Corpus/5826108
Description:
Several definitions of molecular complexity exist to facilitate
prioritization of lead compounds, to identify diversity-inducing and
complexifying reactions, and to guide retrosynthetic searches. In
this work, we focus on synthetic complexity and reformalize its definition
to correlate with the expected number of reaction steps required to
produce a target molecule, with implicit knowledge about what compounds
are reasonable starting materials. We train a neural network model
on 12 million reactions from the Reaxys database to impose a pairwise
inequality constraint enforcing the premise of this definition: that
on average, the products of published chemical reactions should be
more synthetically complex than their corresponding reactants. The
learned metric (SCScore) exhibits highly desirable nonlinear behavior,
particularly in recognizing increases in synthetic complexity throughout
a number of linear synthetic routes.
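
The pairwise inequality constraint described above can be illustrated with a hinge-style ranking loss: a (reactant, product) pair contributes no loss once the product scores higher than the reactant by some margin. The following is only a minimal sketch under stated assumptions, not the authors' implementation. The function names, the 1 to 5 score range, and the 0.25 margin are illustrative choices that do not come from the description above, and the scores are plain numbers standing in for the outputs of a trained model.

```python
import math


def scaled_score(raw, low=1.0, high=5.0):
    """Squash an unbounded raw model output into (low, high) with a sigmoid.

    The 1-5 range is an assumption made for illustration.
    sigmoid(x) is written as 0.5 * (1 + tanh(x / 2)) to avoid overflow.
    """
    return low + (high - low) * 0.5 * (1.0 + math.tanh(raw / 2.0))


def pairwise_hinge_loss(reactant_score, product_score, margin=0.25):
    """Penalize a pair unless the product outscores the reactant by `margin`.

    The margin value is an assumption made for illustration.
    """
    return max(0.0, margin + reactant_score - product_score)


def mean_pair_loss(pairs, margin=0.25):
    """Average the hinge loss over (reactant_score, product_score) pairs."""
    return sum(pairwise_hinge_loss(r, p, margin) for r, p in pairs) / len(pairs)


# Hypothetical scores for three reactant/product pairs.
pairs = [
    (1.5, 2.5),  # product clearly more complex: no penalty
    (2.0, 2.1),  # product only slightly higher: small penalty
    (3.0, 2.0),  # product scored lower than reactant: large penalty
]

if __name__ == "__main__":
    for reactant, product in pairs:
        print(reactant, product, pairwise_hinge_loss(reactant, product))
    print("mean loss:", mean_pair_loss(pairs))
```

Because the loss is averaged over many pairs, individual reactions may violate the inequality, which matches the "on average" wording of the premise.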
Created:
2018-01-25



