Development and rigorous validation of antimalarial predictive models using machine learning approaches
Taylor & Francis Group | Updated 2019-10-24 | Indexed 2026-04-16
Download link:
https://tandf.figshare.com/articles/Development_and_rigorous_validation_of_antimalarial_predictive_models_using_machine_learning_approaches/8975951/1
Resource description:
A large collection of known, experimentally verified compounds from the ChEMBL database was used to build classification models for predicting antimalarial activity against Plasmodium falciparum. Four machine learning methods were used to develop the models on this diverse antimalarial dataset: support vector machine (SVM), random forest (RF), k-nearest neighbour (kNN) and XGBoost. A well-established feature selection framework was used to select the best subset from a larger pool of descriptors. Model performance was rigorously evaluated through applicability domain analysis, Y-scrambling and AUC-ROC curves. Additionally, the predictive power of the models was assessed using probability calibration and predictiveness curves. SVM and XGBoost performed best, yielding an accuracy of ~85% on the independent test set. In terms of probability prediction, SVM and XGBoost were well calibrated. The total gain (TG) from the predictiveness curve also favoured SVM (TG = 0.67) and XGBoost (TG = 0.75). These models also assigned high probability scores to the high-affinity compounds from a PubChem antimalarial bioassay (used as external validation). Our findings suggest that the selected models are robust and potentially useful for facilitating the discovery of antimalarial agents.
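The Y-scrambling check named in the description can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-descriptor toy data, the plain k-nearest-neighbour classifier, and the function names (`knn_predict`, `y_scrambling`, `make_toy_set`) are hypothetical and are not taken from the authors' pipeline or from the ChEMBL dataset. The idea is to retrain on randomly permuted activity labels and confirm that accuracy falls to roughly chance level, which indicates the original model learned a real descriptor-activity relationship.

```python
import random
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
    )
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def accuracy(X_train, y_train, X_test, y_test, k=3):
    """Fraction of test compounds whose predicted label matches the true label."""
    hits = sum(knn_predict(X_train, y_train, x, k) == t for x, t in zip(X_test, y_test))
    return hits / len(y_test)

def y_scrambling(X_train, y_train, X_test, y_test, n_rounds=30, seed=0):
    """Compare accuracy with true training labels against accuracy with permuted labels."""
    rng = random.Random(seed)
    true_acc = accuracy(X_train, y_train, X_test, y_test)
    scrambled = []
    for _ in range(n_rounds):
        y_perm = list(y_train)
        rng.shuffle(y_perm)  # break the descriptor-activity link
        scrambled.append(accuracy(X_train, y_perm, X_test, y_test))
    return true_acc, scrambled

def make_toy_set(n_per_class, rng):
    """Hypothetical data: inactives (0) near (0, 0), actives (1) near (3, 3)."""
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
            y.append(label)
    return X, y

rng = random.Random(42)
X_train, y_train = make_toy_set(50, rng)
X_test, y_test = make_toy_set(20, rng)

true_acc, scrambled_accs = y_scrambling(X_train, y_train, X_test, y_test)
mean_scrambled = sum(scrambled_accs) / len(scrambled_accs)
print(f"true labels: {true_acc:.2f}, scrambled labels (mean): {mean_scrambled:.2f}")
```

On well-separated toy data like this, the model trained on true labels should score far above the scrambled runs, which hover around 0.5 for a balanced two-class set. A real study would typically scramble inside a cross-validation loop and use the full descriptor set.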
Provided by:
N. Subbarao; M.Z. Malik; Danishuddin; G. Madhukar
Created:
2019-09-05



