Predicting Chemical Immunotoxicity through Data-Driven QSAR Modeling of Aryl Hydrocarbon Receptor Agonism and Related Toxicity Mechanisms

Figshare. Updated 2024-05-28; indexed 2026-04-28.

Download link:

https://figshare.com/articles/dataset/Predicting_Chemical_Immunotoxicity_through_Data-Driven_QSAR_Modeling_of_Aryl_Hydrocarbon_Receptor_Agonism_and_Related_Toxicity_Mechanisms/25917703

Description:

Computational modeling has emerged as a time-saving and cost-effective alternative to traditional animal testing for assessing the potential hazards of chemicals. However, few computational modeling studies of immunotoxicity have been reported, and few models are available for predicting immunotoxicants, owing to the lack of training data and the complex mechanisms of immunotoxicity. In this study, we employed a data-driven quantitative structure–activity relationship (QSAR) modeling workflow to substantially expand the limited training data by identifying multiple targets involved in immunotoxicity. To this end, a probe data set of 6,341 chemicals was obtained from a high-throughput screening (HTS) assay testing for activation of the aryl hydrocarbon receptor (AhR) signaling pathway, a key event leading to immunotoxicity. Searching this probe data set against PubChem yielded 3,183 assays with testing results for varying proportions of these 6,341 compounds. Of these, 100 assays were selected for QSAR model development based on their correlation with AhR agonism. Twelve individual QSAR models were built for each assay using combinations of four machine-learning algorithms and three molecular fingerprints. Five-fold cross-validation of the resulting models showed good predictivity (average correct classification rate, CCR = 0.73). A total of 20 assays were further selected based on QSAR model performance, and their QSAR models showed good predictivity for potential immunotoxicants among external chemicals. This study provides a computational modeling strategy that can use large public toxicity data sets to model immunotoxicity and other toxicity endpoints that have limited training data and complicated toxicity mechanisms.
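The model grid and evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The abstract gives only the counts (four algorithms, three fingerprints), so the names below are hypothetical placeholders. The CCR is assumed here to be the mean of sensitivity and specificity for binary activity labels, and the fold splitter is a plain contiguous split without shuffling or stratification.

```python
from itertools import product

# Hypothetical placeholder names: the abstract states only that four
# algorithms and three fingerprints were combined, not which ones.
ALGORITHMS = ["algo_1", "algo_2", "algo_3", "algo_4"]
FINGERPRINTS = ["fp_1", "fp_2", "fp_3"]


def model_grid():
    """Enumerate every (algorithm, fingerprint) pair for one assay."""
    return list(product(ALGORITHMS, FINGERPRINTS))


def ccr(y_true, y_pred):
    """Correct classification rate for binary labels (1 = active).

    Assumption: CCR = (sensitivity + specificity) / 2.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("CCR needs at least one active and one inactive")
    return 0.5 * (tp / pos + tn / neg)


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop
```

Under these assumptions, `model_grid()` returns the twelve combinations trained per assay, and averaging `ccr` over the five test folds gives a per-model cross-validated score comparable in kind to the reported average.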

Created:

2024-05-28