Toward Metabolic Similarity in Read-Across: A Case Study Using Graph Convolutional Networks to Predict Genotoxicity Outcomes from Simulated Metabolic Networks
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Toward_Metabolic_Similarity_in_Read-Across_A_Case_Study_Using_Graph_Convolutional_Networks_to_Predict_Genotoxicity_Outcomes_from_Simulated_Metabolic_Networks/29163386
Resource description:
Metabolic similarity is a key consideration in evaluating candidate
source analogues for read-across (RAx), but approaches to systematically
characterize metabolism for read-across prediction are still evolving.
Metabolic similarity is multifaceted, considering the similarity of
the metabolic tree, the metabolites simulated, and the transformation
pathways. The structure of metabolic trees lends itself naturally
to graph representations, for which several methods, including graph
convolutional networks (GCNs), can be applied to quantify the pairwise
similarity between the target and source analogue(s) within an analogue
or category approach. In this study, we compared graph representations
of simulated metabolism with structural similarity for predicting genotoxicity
outcomes using a data set comprising 5403 chemicals. Xenobiotic metabolism
pathways were predicted using the rat liver models within the commercial
expert system, TIssue MEtabolism Simulator (TIMES), and the phase
I and II xenobiotic metabolism modules within the freely available
system BioTransformer. Metabolic pathways were converted to graphs
and used to train GCNs, generating embeddings for each chemical. The
classification performance of generalized read-across (GenRA), random
forest (RF), logistic regression (LR), and multilayer perceptron (MLP)
was compared using GCN-derived embeddings versus both Morgan and MACCS
chemical fingerprints to identify genotoxic chemicals. GCN embeddings
with LR, based on in vivo TIMES metabolism predictions using MACCS
fingerprints as node features, achieved the highest area under the
curve of the receiver operating characteristic of 0.807, outperforming
GenRA and LR with MACCS fingerprints by 14.47% and 5.49%, respectively.
Our findings suggest that GCN embeddings of predicted metabolism pathways
perform substantially better than structural features of the parent
chemicals in predicting genotoxicity outcomes. Such GCN embeddings
offer new avenues of systematically encoding end point metabolic information
to facilitate analogue identification for read-across.
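
The step of converting a metabolic pathway to a graph and deriving an embedding from it can be illustrated with a minimal sketch. The description above does not specify the network architecture, so everything here is an assumption for illustration: the toy four-node tree, the 5-bit node features standing in for MACCS fingerprints, the undirected treatment of transformation edges, the symmetric-normalized propagation rule, mean pooling, and the random (untrained) weights. It is not the authors' implementation.

```python
import numpy as np

# Hypothetical toy metabolic tree: node 0 is the parent chemical,
# nodes 1-3 are simulated metabolites; each edge is one transformation.
edges = [(0, 1), (0, 2), (1, 3)]
n_nodes = 4

# Hypothetical 5-bit node features standing in for real fingerprints (e.g. MACCS).
X = np.array([
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
], dtype=float)

def normalized_adjacency(edges, n):
    """Build D^-1/2 (A + I) D^-1/2, treating edges as undirected (an assumption)."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    A_hat = A + np.eye(n)              # add self-loops
    d = A_hat.sum(axis=1)              # node degrees including self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_embedding(X, A_norm, weights):
    """Apply stacked graph-convolution layers with ReLU, then mean-pool the nodes."""
    H = X
    for W in weights:
        H = np.maximum(A_norm @ H @ W, 0.0)
    return H.mean(axis=0)              # one vector per chemical (per graph)

# Untrained random weights; in practice these would be learned from labels.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(5, 8)), rng.normal(size=(8, 4))]

A_norm = normalized_adjacency(edges, n_nodes)
embedding = gcn_embedding(X, A_norm, weights)
print(embedding.shape)
```

The pooled vector plays the role of the per-chemical embedding that a downstream classifier such as logistic regression would consume in place of a parent-structure fingerprint.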
Created:
2025-05-28



