Substructure Mining Using Elaborate Chemical Representation
Source: NIAID Data Ecosystem (indexed 2026-03-06)
Download link:
https://figshare.com/articles/dataset/Substructure_Mining_Using_Elaborate_Chemical_Representation/3231058
Resource description:
Substructure mining algorithms are important drug discovery tools since they can find substructures that
affect physicochemical and biological properties. Current methods, however, consider only part of the
chemical information that is present within a data set of compounds. Therefore, the overall aim of our study
was to enable more exhaustive data mining by designing methods that detect all substructures of any size,
shape, and level of chemical detail. A means of chemical representation was developed that uses atomic
hierarchies, thus enabling substructure mining to consider general and/or highly specific features. As a
proof-of-concept, the efficient, multipurpose graph mining system Gaston learned substructures of any size and
shape from a mutagenicity data set that was represented in this manner. From these substructures, we extracted
a set of only six nonredundant, discriminative substructures that represent relevant biochemical knowledge.
Our results demonstrate the individual and synergistic importance of elaborate chemical representation and
mining for nonlinear substructures. We conclude that the combination of elaborate chemical representation
and Gaston provides an excellent method for 2D substructure mining as this recipe systematically explores
all substructures in different levels of chemical detail.
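
The idea of an atomic hierarchy can be illustrated with a small sketch. It is not the authors' representation and does not use Gaston; the hierarchy table, the function names, and the toy molecules below are all hypothetical, chosen only to show how a general label (such as "Halogen") can match more compounds than a specific one (such as "Cl"). For brevity, a pattern here is a single bond between two labeled atoms rather than a substructure of arbitrary size and shape.

```python
from typing import Dict, List, Tuple

# A molecule is (atom elements, bonds as index pairs). Toy encoding, assumed for illustration.
Molecule = Tuple[List[str], List[Tuple[int, int]]]

# Hypothetical hierarchy: each element lists its labels from most general to most specific.
ATOM_HIERARCHY: Dict[str, List[str]] = {
    "F": ["Any", "Halogen", "F"],
    "Cl": ["Any", "Halogen", "Cl"],
    "Br": ["Any", "Halogen", "Br"],
    "N": ["Any", "Heteroatom", "N"],
    "O": ["Any", "Heteroatom", "O"],
    "C": ["Any", "C"],
}


def labels_for(element: str) -> List[str]:
    """Return every hierarchy label that applies to an element."""
    return ATOM_HIERARCHY.get(element, ["Any", element])


def edge_matches(pattern: Tuple[str, str], a: str, b: str) -> bool:
    """True if the bonded pair (a, b) matches the two pattern labels in either order."""
    pa, pb = pattern
    la, lb = labels_for(a), labels_for(b)
    return (pa in la and pb in lb) or (pa in lb and pb in la)


def support(pattern: Tuple[str, str], molecules: List[Molecule]) -> int:
    """Count the molecules that contain at least one bond matching the pattern."""
    return sum(
        1
        for atoms, bonds in molecules
        if any(edge_matches(pattern, atoms[i], atoms[j]) for i, j in bonds)
    )


# Toy heavy-atom skeletons (hydrogens omitted).
molecules: List[Molecule] = [
    (["C", "C", "Cl"], [(0, 1), (1, 2)]),
    (["C", "C", "Br"], [(0, 1), (1, 2)]),
    (["C", "C", "O"], [(0, 1), (1, 2)]),
]

specific = support(("C", "Cl"), molecules)       # only the chlorinated skeleton
general = support(("C", "Halogen"), molecules)   # chlorinated and brominated skeletons
```

In this toy setting the general pattern is supported by more molecules than the specific one, which is the kind of trade-off between general and highly specific features that the abstract describes.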
Created:
2016-05-05



