Carcinogenicity Prediction of Noncongeneric Chemicals by a Support Vector Machine
Source: NIAID Data Ecosystem (indexed 2026-03-09)
Download link:
https://figshare.com/articles/dataset/Carcinogenicity_Prediction_of_Noncongeneric_Chemicals_by_a_Support_Vector_Machine/2413669
Description:
The ability to identify carcinogenic compounds is of fundamental
importance to the safe application of chemicals. In this study, we
generated an array of in silico models allowing the classification
of compounds into carcinogenic and noncarcinogenic
agents based on a data set of 852 noncongeneric chemicals collected
from the Carcinogenic Potency Database (CPDBAS). Twenty-four molecular
descriptors were selected by Pearson correlation, F-score, and stepwise
regression analysis. These descriptors cover a range of physicochemical
properties, including electrophilicity, geometry, molecular weight,
size, and solubility. The descriptor "mutagenic" showed
the highest correlation coefficient with carcinogenicity. On the basis
of these descriptors, a support vector machine (SVM)-based classification
model was developed and fine-tuned by a 10-fold cross-validation approach.
Both the SVM model (Model A1) and the best model from the 10-fold
cross-validation runs (Model B3) gave good results on the test set
with prediction accuracy over 80%, sensitivity over 76%, and specificity
over 82%. In addition, extended connectivity fingerprints (ECFPs)
and the Toxtree software were used to analyze the functional groups
and substructures linked to carcinogenicity. It was found that the
results of both methods are in good agreement.
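
The quantities named in the description (a descriptor's Pearson correlation with the carcinogenicity label, and the accuracy, sensitivity, and specificity of a classifier) can be illustrated with a minimal sketch. This is not the authors' code: the function names `pearson` and `classification_metrics` are hypothetical, and the ten toy compounds below are invented for illustration only and are not taken from CPDBAS.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = carcinogenic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # carcinogens found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # noncarcinogens cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed carcinogens
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy data (assumption: invented values, for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]          # experimental label
mutagenic_flag = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]  # hypothetical binary descriptor
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]          # hypothetical model output

r = pearson(mutagenic_flag, y_true)
metrics = classification_metrics(y_true, y_pred)
print(f"Pearson r = {r:.3f}")
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for a data set like this one because a model can reach a high accuracy while still missing many carcinogens if the two classes are unevenly represented.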
Created:
2016-02-19



