
Table6_A machine learning-based approach to ERα bioactivity and drug ADMET prediction.docx

Indexed by NIAID Data Ecosystem on 2026-03-14
Download link:
https://figshare.com/articles/dataset/Table6_A_machine_learning-based_approach_to_ER_bioactivity_and_drug_ADMET_prediction_docx/21810930
Resource description:
Predicting ERα bioactivity and mining the relationships among absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties during drug research and development can improve the efficiency of developing breast cancer-specific drugs and reduce the misjudgment rate of R&D personnel. This study constructed a quantitative prediction model for ERα bioactivity and a classification prediction model for ADMET properties. ERα bioactivity predictions from XGBoost, LightGBM, Random Forest, and an MLP neural network were compared using mean absolute error (MAE), mean squared error (MSE), and R²; the two models with the highest prediction accuracy were selected and fused to obtain the ERα bioactivity prediction model. The data were then subjected to model-based feature selection and to FDR/FPR-based feature selection, and the results were passed to a voting classifier to obtain the ADMET classification prediction model. In this study, 430 molecular descriptors were removed, and the 20 molecular descriptors with the most significant effect on biological activity, obtained by the dual feature screening combined optimization method, were used to build the ERα bioactivity prediction model; the ADMET properties of the drugs were then classified and predicted. For model-based feature selection, ExtraTreesClassifier selected 80 variables and GradientBoostingClassifier selected 40 variables. FDR/FPR-based feature selection was applied as well, and the three classification models obtained from the two methods were combined in the voting classifier to produce the final model.
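The regression comparison and fusion step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data are synthetic, scikit-learn's `GradientBoostingRegressor` stands in for XGBoost and LightGBM so the example needs only one library, ranking by MAE is one possible reading of "highest prediction accuracy", and plain averaging of the two best models is an assumed fusion rule, since the description does not say how the models were fused.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 20 selected molecular descriptors and an activity value.
X, y = make_regression(n_samples=400, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate regressors (gradient boosting is a stand-in for XGBoost/LightGBM).
candidates = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
    ),
}


def score(y_true, y_pred):
    """Return the three metrics named in the description."""
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "mse": mean_squared_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
    }


predictions, scores = {}, {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
    scores[name] = score(y_test, predictions[name])

# Assumed selection rule: keep the two models with the lowest MAE.
best_two = sorted(scores, key=lambda n: scores[n]["mae"])[:2]

# Assumed fusion rule: unweighted average of the two models' predictions.
fused_pred = np.mean([predictions[n] for n in best_two], axis=0)
fused_scores = score(y_test, fused_pred)

for name, s in {**scores, "fused": fused_scores}.items():
    print(f"{name}: MAE={s['mae']:.3f} MSE={s['mse']:.3f} R2={s['r2']:.3f}")
```

Averaging is only the simplest option; a weighted blend or a stacked model would fit the same description, and whether fusion actually improves on the better single model has to be checked on held-out data.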
The experimental results showed that the model's evaluation metrics and ROC curves were excellent and that it could accurately predict ERα bioactivity and ADMET properties. The model constructed in this study offers high accuracy, fast convergence, and robustness, achieves very high accuracy for ADMET and ERα classification prediction, has bright prospects in the biopharmaceutical field, and is an important method for energy conservation and yield increase in the future.
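The ADMET classification stage (model-based and FDR/FPR-based feature selection feeding a voting classifier) can be sketched as below. Several things here are assumptions made for illustration: the data are synthetic, a single binary label stands in for one ADMET property, the 200 input features are an arbitrary count, the three voted models are read as one pipeline per selector, and the final `RandomForestClassifier` in each pipeline and the soft voting rule are placeholders. The 80 and 40 feature counts come from the description. `SelectFpr` is used for the statistical selector; scikit-learn's `SelectFdr` takes the same arguments and can be swapped in.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.feature_selection import SelectFpr, SelectFromModel, f_classif
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in: 200 descriptors, one binary ADMET-style label.
X, y = make_classification(
    n_samples=600, n_features=200, n_informative=20, n_redundant=0,
    n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def selected_pipeline(selector):
    """Feature selector followed by a placeholder classifier."""
    return Pipeline([
        ("select", selector),
        ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ])


# threshold=-np.inf makes max_features the only criterion, so exactly
# 80 and 40 features are kept, matching the counts in the description.
extra_trees_select = SelectFromModel(
    ExtraTreesClassifier(n_estimators=100, random_state=0),
    max_features=80, threshold=-np.inf,
)
gradient_boosting_select = SelectFromModel(
    GradientBoostingClassifier(random_state=0),
    max_features=40, threshold=-np.inf,
)
# Univariate ANOVA F-test with a false-positive-rate cutoff.
fpr_select = SelectFpr(f_classif, alpha=0.05)

voter = VotingClassifier(
    estimators=[
        ("extra_trees", selected_pipeline(extra_trees_select)),
        ("gradient_boosting", selected_pipeline(gradient_boosting_select)),
        ("fpr", selected_pipeline(fpr_select)),
    ],
    voting="soft",  # average predicted probabilities (assumed rule)
)
voter.fit(X_train, y_train)

# Number of features each fitted selector kept.
n_selected = {
    name: int(pipe.named_steps["select"].get_support().sum())
    for name, pipe in voter.named_estimators_.items()
}
accuracy = accuracy_score(y_test, voter.predict(X_test))
roc_auc = roc_auc_score(y_test, voter.predict_proba(X_test)[:, 1])

print(n_selected)
print(f"accuracy={accuracy:.3f} roc_auc={roc_auc:.3f}")
```

Keeping each selector inside its own pipeline means the selection is refit on the training split only, which avoids leaking test data into the choice of descriptors.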

Created:
2023-01-04