EviCYP: In Silico Prediction of Cytochrome P450 Substrates Based on Vector Quantization and Evidential Deep Learning

Source: Figshare. Updated 2026-03-17; indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/EviCYP_In_Silico_Prediction_of_Cytochrome_P450_Substrates_Based_on_Vector_Quantization_and_Evidential_Deep_Learning/31795481
Description:
The accurate identification of cytochrome P450 (CYP) substrates is crucial in drug discovery and safety assessment, as these enzymes mediate the metabolism of most clinical drugs. However, existing computational models are often limited by data quality issues and lack the ability to quantify prediction uncertainty, hindering their reliable application. To address these challenges, we present EviCYP, a novel prediction framework that integrates evidential deep learning with vector quantization (VQ). We first constructed a high-quality data set by curating 4388 substrates and 2880 nonsubstrates from 1629 publications, and supplemented it with 3728 pseudonegative samples, resulting in 10,996 samples spanning nine major CYP isoforms. The EviCYP architecture processes multimodal molecular representations and enzyme sequences through dedicated encoders, compresses features via VQ to reduce redundancy, and employs an evidential layer to output both class probabilities and an uncertainty estimate. On an internal test set, EviCYP achieved an average AUROC of 0.9500. Notably, the model’s uncertainty quantification is highly reliable, with high-uncertainty predictions strongly correlating with classification errors. This work provides a robust and trustworthy computational tool for CYP substrate prediction.
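The description says the evidential layer outputs both class probabilities and an uncertainty estimate but does not give the formulation. The following is a minimal sketch of one common evidential-classification scheme (Dirichlet-based, with uncertainty u = K / S), offered as an assumption about how such a layer might work rather than as EviCYP's actual implementation. The function names `softplus` and `evidential_output` and the example logits are hypothetical.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable softplus, log(1 + exp(x)); keeps evidence non-negative.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def evidential_output(logits):
    """Map raw logits to (class probabilities, uncertainty) via a Dirichlet.

    Assumed formulation, not taken from the EviCYP source:
    evidence_k = softplus(logit_k), alpha_k = evidence_k + 1,
    S = sum(alpha), p_k = alpha_k / S, u = K / S.
    """
    evidence = [softplus(z) for z in logits]  # non-negative evidence per class
    alpha = [e + 1.0 for e in evidence]       # Dirichlet concentration parameters
    strength = sum(alpha)                     # Dirichlet strength S
    probs = [a / strength for a in alpha]     # expected class probabilities
    uncertainty = len(alpha) / strength       # u = K / S, lies in (0, 1]
    return probs, uncertainty


# Two hypothetical binary cases (substrate vs. nonsubstrate):
confident = evidential_output([8.0, -8.0])   # strong evidence for one class
unsure = evidential_output([-8.0, -8.0])     # almost no evidence for either class
```

Under this scheme, a prediction backed by little total evidence receives an uncertainty close to 1, so it can be flagged for review regardless of which class has the higher probability.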
Created:
2026-03-17