Modeling a Crowdsourced Definition of Molecular Complexity
Source: NIAID Data Ecosystem (indexed 2026-03-08)
Download link:
https://figshare.com/articles/dataset/Modeling_a_Crowdsourced_Definition_of_Molecular_Complexity/2280664
Description:
This paper brings together the concepts of molecular complexity and crowdsourcing.
An exercise was done at Merck where 386 chemists voted on the molecular
complexity (on a scale of 1–5) of 2681 molecules taken from
various sources: public, licensed, and in-house. The meanComplexity
of a molecule is the average over all votes for that molecule. As
long as enough votes are cast per molecule, we find meanComplexity
is quite easy to model with QSAR methods using only a handful of physical
descriptors (e.g., number of chiral centers, number of unique topological
torsions, a Wiener index, etc.). The high level of self-consistency
of the model (cross-validated R² ≈ 0.88) is remarkable
given that our chemists do not agree with each other strongly about
the complexity of any given molecule. Thus, the power of crowdsourcing
is clearly demonstrated in this case. The meanComplexity appears to
be correlated with at least one metric of synthetic complexity from
the literature derived in a different way and is correlated with values
of process mass intensity (PMI) from the literature and from in-house
studies. Complexity can be used to differentiate between in-house
programs and to follow a program over time.
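
The two quantities named in the description, meanComplexity and a cross-validated R², can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `min_votes` threshold, and the single-descriptor least-squares line are assumptions chosen to keep the example small, whereas the paper reports QSAR models built on several descriptors.

```python
from collections import defaultdict
from statistics import mean


def mean_complexity(votes, min_votes=1):
    """Average the 1-5 votes per molecule.

    votes: iterable of (molecule_id, score) pairs.
    min_votes: hypothetical cutoff; molecules with fewer votes are dropped.
    """
    by_molecule = defaultdict(list)
    for molecule_id, score in votes:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range for {molecule_id}: {score}")
        by_molecule[molecule_id].append(score)
    return {
        molecule_id: mean(scores)
        for molecule_id, scores in by_molecule.items()
        if len(scores) >= min_votes
    }


def loo_cv_r2(x, y):
    """Leave-one-out cross-validated R^2 for a one-descriptor linear fit.

    x: list of descriptor values (e.g. number of chiral centers).
    y: list of meanComplexity values, same length as x.
    """
    predictions = []
    for i in range(len(x)):
        # Fit the line on every molecule except the i-th one.
        xs = x[:i] + x[i + 1:]
        ys = y[:i] + y[i + 1:]
        mx, my = mean(xs), mean(ys)
        sxx = sum((a - mx) ** 2 for a in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        slope = sxy / sxx if sxx else 0.0
        # Predict the held-out molecule.
        predictions.append(my + slope * (x[i] - mx))
    y_bar = mean(y)
    ss_res = sum((b - p) ** 2 for b, p in zip(y, predictions))
    ss_tot = sum((b - y_bar) ** 2 for b in y)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up votes, for illustration only.
    votes = [("mol_a", 1), ("mol_a", 2), ("mol_a", 3), ("mol_b", 5)]
    print(mean_complexity(votes, min_votes=2))
```

Raising `min_votes` reflects the condition in the description that enough votes must be cast per molecule before the average is modeled.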
Created:
2014-06-23



