Machine Learning and Large Language Models for Modeling Complex Toxicity Pathways and Predicting Steroidogenesis
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Machine_Learning_and_Large_Language_Models_for_Modeling_Complex_Toxicity_Pathways_and_Predicting_Steroidogenesis/29425378
Description:
High-throughput screening and computational models have been effective
in predicting chemical interactions with estrogen and androgen receptors,
but similar approaches for steroidogenesis remain limited. To address
this gap, we developed general steroidogenesis modulation models using
data from ∼1,800 chemicals screened in H295R human adrenocortical
carcinoma cells. A random forest model was validated using a prospective
test set of 20 compounds (14 predicted active, 6 inactive), achieving
80% accuracy with conformal prediction adjustments. In parallel, we
built classification and regression models based on IC50 data from ChEMBL for key steroidogenic enzymes, including CYP17A1,
CYP21A2, CYP11B1, CYP11B2, 17β-HSD (1/2/3/5), 5α-reductase
(1/2), and CYP19A1 (126–9,327 compounds per target). These
models enable predictions of both general steroidogenesis inhibition
and potential molecular targets. Additionally, we developed a transformer-based
model (MolBART) to predict all end points simultaneously and validated
this performance. Combined, these models may offer a rapid and scalable
system for assessing chemical impacts on steroidogenesis, supporting
chemical risk assessment, product stewardship, and regulatory decision-making.
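
The description mentions conformal prediction adjustments on top of the random forest classifier but does not say which variant was used. As an illustration only, the following is a minimal sketch of class-conditional (Mondrian) inductive conformal prediction for a binary active/inactive classifier. It assumes the model outputs a probability of "active" for each compound and that a held-out calibration set with known labels is available. The function names, the example probabilities, and the significance level `epsilon` are hypothetical and are not taken from the dataset.

```python
from typing import Dict, List, Sequence, Set


def calibrate(cal_probs: Sequence[float], cal_labels: Sequence[int]) -> Dict[int, List[float]]:
    """Collect per-class nonconformity scores from a calibration set.

    cal_probs[i] is the predicted probability that compound i is active (label 1).
    The nonconformity score is 1 minus the probability assigned to the true class.
    """
    scores: Dict[int, List[float]] = {0: [], 1: []}
    for p_active, label in zip(cal_probs, cal_labels):
        p_true = p_active if label == 1 else 1.0 - p_active
        scores[label].append(1.0 - p_true)
    return scores


def p_values(p_active: float, scores: Dict[int, List[float]]) -> Dict[int, float]:
    """Compute a conformal p-value for each candidate label of a new compound."""
    result: Dict[int, float] = {}
    for label in (0, 1):
        p_label = p_active if label == 1 else 1.0 - p_active
        alpha = 1.0 - p_label  # nonconformity of the new compound under this label
        n_at_least = sum(1 for s in scores[label] if s >= alpha)
        result[label] = (n_at_least + 1) / (len(scores[label]) + 1)
    return result


def prediction_set(p_active: float, scores: Dict[int, List[float]], epsilon: float = 0.2) -> Set[int]:
    """Return every label whose p-value exceeds the significance level epsilon."""
    return {label for label, p in p_values(p_active, scores).items() if p > epsilon}


if __name__ == "__main__":
    # Hypothetical calibration data: predicted P(active) and true labels.
    cal_probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
    cal_labels = [1, 1, 1, 1, 0, 0, 0, 0]
    cal_scores = calibrate(cal_probs, cal_labels)
    for p in (0.95, 0.5, 0.05):
        print(p, prediction_set(p, cal_scores, epsilon=0.25))
```

A single-label set is a confident call, a set containing both labels means the model cannot separate the classes at the chosen `epsilon`, and an empty set flags a compound unlike anything in the calibration data. Computing scores per class keeps the error rate controlled separately for actives and inactives, which matters when the two classes are imbalanced.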
Created:
2025-06-27



