Sequence-Based Prediction of Cysteine Reactivity Using Machine Learning

Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/Sequence-Based_Prediction_of_Cysteine_Reactivity_Using_Machine_Learning/5539969
Description:
As one of the most intrinsically reactive amino acids, cysteine carries a variety of important biochemical functions, including catalysis and redox regulation. Discovery and characterization of cysteines with heightened reactivity will help annotate protein functions. Chemical proteomic methods have been used to quantitatively profile cysteine reactivity in native proteomes, showing a strong correlation between the chemical reactivity of a cysteine and its functionality; however, the relationship between the cysteine reactivity and its local sequence has not yet been systematically explored. Herein, we report a machine learning method, sbPCR (sequence-based prediction of cysteine reactivity), which combines the basic local alignment search tool, truncated composition of k-spaced amino acid pair analysis, and support vector machine to predict cysteines with hyper-reactivity based on only local sequence features. Using a benchmark set compiled from hyper-reactive cysteines in human proteomes, our method can achieve a prediction accuracy of 98%, a precision of 95%, and a recall ratio of 89%. We utilized these governing features of local sequence motifs to expand the prediction to potential hyper-reactive cysteines in other proteomes deposited in the UniProt database. We validated our predictions in Escherichia coli by activity-based protein profiling and discovered a hyper-reactive cysteine from a functionally uncharacterized protein, YecH. Biochemical analysis suggests that the hyper-reactive cysteine might be involved in metal binding. Our computational method provides a large inventory of potential hyper-reactive cysteines in proteomes and is highly complementary to other experimental approaches to guide systematic annotation of protein functions in the postgenome era.
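The description names composition of k-spaced amino acid pairs (CKSAAP) as one of the sequence features. As a rough illustration only, the sketch below computes plain CKSAAP frequencies for a peptide window. It is not the authors' sbPCR code: the function name `cksaap`, the `k_max` default, the example window, and the normalization by the number of valid pairs are all assumptions, and the BLAST step and the "truncated" variant mentioned in the description are not reproduced here.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(window, k_max=2):
    """Return {(k, pair): frequency} for a peptide window.

    For each gap k in 0..k_max, count residue pairs (a, b) where b sits
    k + 1 positions after a, then divide by the number of valid pairs.
    Hypothetical helper for illustration; not the published implementation.
    """
    features = {}
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            # Skip pairs containing non-standard symbols (e.g. padding).
            if pair in counts:
                counts[pair] += 1
                total += 1
        for pair, n in counts.items():
            features[(k, pair)] = n / total if total else 0.0
    return features


# Usage: a made-up 11-residue window centered on a cysteine.
vector = cksaap("GSAKDCLTEVR", k_max=2)
print(len(vector))  # 3 gap values x 400 pairs
```

A feature dictionary like this could be flattened into a fixed-order vector and passed to a support vector machine classifier (for example scikit-learn's `sklearn.svm.SVC`), which is the general kind of model the description refers to.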
Created:
2017-10-26