Machine Learning Based Prediction of Enzymatic Degradation of Plastics Using Encoded Protein Sequence and Effective Feature Representation
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Machine_Learning_Based_Prediction_of_Enzymatic_Degradation_of_Plastics_Using_Encoded_Protein_Sequence_and_Effective_Feature_Representation/23541982
Description:
Enzyme biocatalysis for plastic treatment and recycling is an emerging
field of growing interest. However, it is challenging and time-consuming
to identify plastic-degrading enzymes with desirable functionality,
given the large number of putative enzyme sequences. There is a critical
need to develop an effective approach to accurately predict the enzyme
activity in degrading different types of plastics. In this study,
we developed a machine-learning-based plastic enzymatic degradation
(PED) framework to predict the ability of an enzyme to degrade plastics
of interest by exploring and recognizing hidden patterns in protein
sequences. A data set integrating information from a wide range of
experimentally verified enzymes and various common plastic substrates
was created. A new context-aware enzyme sequence representation (CESR)
mechanism was developed to learn the abundant contextual information
in enzyme sequences, and feature extraction was performed for enzymes
at both the amino acid level and global sequence level. Thirteen machine
learning classification algorithms were compared, and XGBoost was
identified as the best-performing algorithm. PED achieved an overall
accuracy of 90.2% and outperformed sequence-based protein classification
models from the existing literature. Furthermore, important enzyme
features in plastic degradation were identified and comprehensively
interpreted. This study demonstrated a new tool for the prediction
and discovery of plastic-degrading enzymes.
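
To make the two feature levels mentioned in the description concrete, the following is a minimal sketch of extracting features from a protein sequence at the amino acid level and at the global sequence level. It is not the CESR mechanism or the PED pipeline from the study: the one-hot residue encoding, the composition-based summary, the function names, and the toy sequence are all assumptions chosen for illustration.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed (arbitrary) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def residue_level_features(seq):
    """Amino-acid-level view: one one-hot vector per residue.

    Hypothetical stand-in for a learned, context-aware encoding.
    Non-standard symbols (e.g. 'X') are left as all-zero rows.
    """
    rows = []
    for aa in seq.upper():
        row = [0] * len(AMINO_ACIDS)
        if aa in AA_INDEX:
            row[AA_INDEX[aa]] = 1
        rows.append(row)
    return rows


def global_features(seq):
    """Global view: sequence length followed by fractional composition."""
    seq = seq.upper()
    counts = Counter(aa for aa in seq if aa in AA_INDEX)
    total = sum(counts.values())
    composition = [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]
    return [len(seq)] + composition


# Made-up sequence for illustration only, not taken from the data set.
toy_seq = "MKTAYIAKQR"
per_residue = residue_level_features(toy_seq)  # 10 rows of 20 values
summary = global_features(toy_seq)             # 1 length + 20 fractions
```

A fixed-length vector such as `summary` could be passed to a tabular classifier (the study reports XGBoost as its best performer), whereas the variable-length `per_residue` matrix would first need pooling or a sequence model; which of these the original framework uses is not stated in this record.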
Created:
2023-06-19



