CovCysPredictor: Predicting Selective Covalently Modifiable Cysteines Using Protein Structure and Interpretable Machine Learning

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/CovCysPredictor_Predicting_Selective_Covalently_Modifiable_Cysteines_Using_Protein_Structure_and_Interpretable_Machine_Learning/28171543
Description:
Targeted covalent inhibition is a powerful therapeutic modality in the drug discoverer’s toolbox. Recent advances in covalent drug discovery, in particular, targeting cysteines, have led to significant breakthroughs for traditionally challenging targets such as mutant KRAS, which is implicated in diverse human cancers. However, identifying cysteines for targeted covalent inhibition is a difficult task, as experimental and in silico tools have shown limited accuracy. Using the recently released CovPDB and CovBinderInPDB databases, we have trained and tested interpretable machine learning (ML) models to identify cysteines that are liable to be covalently modified (i.e., “ligandable” cysteines). We explored myriad physicochemical features (pKa, solvent exposure, residue electrostatics, etc.) and protein–ligand pocket descriptors in our ML models. Our final logistic regression model achieved a median F1 score of 0.73 on held-out test sets. When tested on a small sample of holo proteins, our model also showed reasonable performance, accurately predicting the most ligandable cysteine in most cases. Taken together, these results indicate that we can accurately predict potential ligandable cysteines for targeted covalent drug discovery, privileging cysteines that are more likely to be selective rather than purely reactive. We release this tool to the scientific community as CovCysPredictor.
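For readers unfamiliar with the reported metric, the following is a minimal sketch of how a median F1 score over several held-out test sets can be computed. It is not the authors' evaluation code: the function names `f1_score` and `median_f1` and the toy labels are hypothetical, and binary labels are assumed with 1 meaning "ligandable" and 0 meaning "not ligandable".

```python
from statistics import median


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def median_f1(splits):
    """Median of per-split F1 scores; each split is (y_true, y_pred)."""
    return median(f1_score(y_true, y_pred) for y_true, y_pred in splits)


# Hypothetical toy data: three held-out splits of (true labels, predictions).
toy_splits = [
    ([1, 1, 0, 0], [1, 1, 0, 0]),
    ([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]),
    ([1, 0, 1, 0], [0, 0, 1, 1]),
]
toy_median = median_f1(toy_splits)
```

Reporting the median rather than the mean makes the summary less sensitive to a single unusually easy or hard split.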
Created:
2025-01-09