Ensemble Models Based on QuBiLS-MAS Features and Shallow Learning for the Prediction of Drug-Induced Liver Toxicity: Improving Deep Learning and Traditional Approaches
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Ensemble_Models_Based_on_QuBiLS-MAS_Features_and_Shallow_Learning_for_the_Prediction_of_Drug-Induced_Liver_Toxicity_Improving_Deep_Learning_and_Traditional_Approaches/12301907
Description:
Drug-induced liver
injury (DILI) is a key safety issue in the drug
discovery pipeline and a regulatory concern. Thus, many in
silico tools have been proposed to improve the hepatotoxicity
prediction of organic-type chemicals. Here, classifiers for the prediction
of DILI were developed by using QuBiLS-MAS 0–2.5D molecular
descriptors and shallow machine learning techniques, on a training
set composed of 1075 molecules. The best ensemble model built, E13, showed good statistical parameters on the
learning series: accuracy = 0.840, sensitivity
= 0.890, specificity = 0.761, Matthews correlation coefficient
= 0.660, and area under the ROC curve = 0.904. The model was also
satisfactorily evaluated with a Y-scrambling test,
repeated k-fold cross-validation, and repeated k-holdout validation. In addition, an exhaustive external
validation was also carried out by using two test sets and five external
test sets, with an average accuracy value equal to 0.854 (±0.062)
and a coverage equal to 98.4% according to its applicability domain.
The performance of the E13 model was also compared statistically against results and tools reported in the literature
(e.g., Padel DDPredictor Software, Deep Learning DILIserver, and Vslead).
In general, E13 presented the best
global performance in all experiments. The sum of the ranking differences
procedure provided a very similar grouping pattern to that of the
M-ANOVA statistical analysis, where E13 was identified
as the best model for DILI predictions. Noncommercial, fully
cross-platform software for DILI prediction was also developed,
which is freely available at http://tomocomd.com/apps/ptoxra. This software was used for
the screening of seven data sets, containing natural products, leads,
toxic materials, and FDA approved drugs, to assess the usefulness
of the QSAR models in the DILI labeling of organic substances; it
was found that 50–92% of the evaluated molecules are DILI-positive
compounds. All in all, it can be stated that the E13 model
is a relevant method for the prediction of DILI risk in humans, as
it shows the best results among all of the methods analyzed.
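
For readers unfamiliar with the statistics quoted in the description, the following is a minimal sketch of how accuracy, sensitivity, specificity, and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the confusion counts are hypothetical and chosen only for illustration; they are not taken from the E13 model or its data sets.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: DILI-positive compounds predicted positive/negative.
    tn/fp: DILI-negative compounds predicted negative/positive.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not from the paper).
example = binary_metrics(tp=80, fn=20, tn=70, fp=30)
for name, value in example.items():
    print(f"{name} = {value:.3f}")
```

The area under the ROC curve is not included because it requires per-compound prediction scores rather than confusion counts.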
Created:
2020-05-14



