Machine Learning Based Prediction of Enzymatic Degradation of Plastics Using Encoded Protein Sequence and Effective Feature Representation

Source: Figshare. Updated: 2023-06-19. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Machine_Learning_Based_Prediction_of_Enzymatic_Degradation_of_Plastics_Using_Encoded_Protein_Sequence_and_Effective_Feature_Representation/23541985
Description:
Enzyme biocatalysis for plastic treatment and recycling is an emerging field of growing interest. However, it is challenging and time-consuming to identify plastic-degrading enzymes with desirable functionality, given the large number of putative enzyme sequences. There is a critical need to develop an effective approach to accurately predict the enzyme activity in degrading different types of plastics. In this study, we developed a machine-learning-based plastic enzymatic degradation (PED) framework to predict the ability of an enzyme to degrade plastics of interest by exploring and recognizing hidden patterns in protein sequences. A data set integrating information from a wide range of experimentally verified enzymes and various common plastic substrates was created. A new context-aware enzyme sequence representation (CESR) mechanism was developed to learn the abundant contextual information in enzyme sequences, and feature extraction was performed for enzymes at both the amino acid level and global sequence level. Thirteen machine learning classification algorithms were compared, and XGBoost was identified as the best-performing algorithm. PED achieved an overall accuracy of 90.2% and outperformed sequence-based protein classification models from the existing literature. Furthermore, important enzyme features in plastic degradation were identified and comprehensively interpreted. This study demonstrated a new tool for the prediction and discovery of plastic-degrading enzymes.
Created:
2023-06-19
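
The description mentions feature extraction at both the amino acid level and the global sequence level. As a rough illustration only, the sketch below computes two very simple sequence-derived features: amino acid composition (a global, sequence-level vector) and overlapping k-mer counts (a crude stand-in for local context). This is not the CESR mechanism from the study, whose details are not given here; the function names, the choice of k, and the toy sequence are all hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors line up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Global feature: fraction of each standard residue in the sequence.

    Hypothetical helper for illustration; non-standard characters are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def kmer_counts(sequence: str, k: int = 2) -> Counter:
    """Local-context feature: counts of overlapping k-mers (k=2 is an assumption)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy sequence, not taken from the data set.
toy_sequence = "MKTAYIAK"
composition = amino_acid_composition(toy_sequence)
dipeptides = kmer_counts(toy_sequence, k=2)
```

Vectors like these could be passed to any classifier; the study reports that XGBoost performed best among the thirteen algorithms it compared, but the features it used are richer than the ones sketched here.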