Prediction of the Blood–Brain Barrier (BBB) Permeability of Chemicals Based on Machine-Learning and Ensemble Methods
Source: Figshare | Updated: 2021-05-28 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/Prediction_of_the_Blood_Brain_Barrier_BBB_Permeability_of_Chemicals_Based_on_Machine-Learning_and_Ensemble_Methods/14695616
Description:
The ability of chemicals to cross the blood–brain barrier (BBB) is a key factor in central nervous system (CNS) drug development. Although many models for BBB permeability prediction have been developed, their accuracy (ACC) and sensitivity (SEN) remain insufficient. To improve performance, this study built in silico ensemble-learning models that predict the BBB permeability of compounds, using 3 machine-learning algorithms and 9 molecular fingerprints on 1757 chemicals integrated from 2 published data sets. Among the base classifiers, the best performance was achieved by a model that combines a random forest (RF) with the MACCS molecular fingerprint: an ACC of 0.910, an area under the receiver-operating characteristic (ROC) curve (AUC) of 0.957, a SEN of 0.927, and a specificity of 0.867 in 5-fold cross-validation. The ensemble models outperform most of the base classifiers. The final ensemble model also showed good accuracy in external validation and can be used for the early screening of CNS drugs.
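As a rough illustration of the metrics named in the description, the sketch below computes ACC, SEN, and specificity from binary predictions and combines three base classifiers into an ensemble. The description does not say how the base classifiers are combined, so simple majority voting is an assumption here. The function names and the toy labels are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive=1):
    """Return ACC, SEN and SPE for binary labels.

    Assumes y_true contains at least one positive and one negative label,
    otherwise SEN or SPE would divide by zero.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # overall accuracy
        "SEN": tp / (tp + fn),                   # sensitivity (true-positive rate)
        "SPE": tn / (tn + fp),                   # specificity (true-negative rate)
    }


def majority_vote(predictions):
    """Combine per-classifier label lists by majority vote (assumed scheme).

    With an odd number of binary classifiers there are no ties.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy labels only: 1 = BBB-permeable, 0 = non-permeable (hypothetical values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
clf_a = [1, 1, 1, 0, 0, 0, 1, 0]
clf_b = [1, 1, 0, 1, 0, 1, 0, 0]
clf_c = [1, 0, 1, 1, 0, 0, 0, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])

print("base A:  ", classification_metrics(y_true, clf_a))
print("ensemble:", classification_metrics(y_true, ensemble))
```

In this toy case each base classifier makes two errors, but the errors fall on different compounds, so the vote corrects all of them. That is the general motivation for ensembling and says nothing about the size of the gain reported in the study.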
Created:
2021-05-28



