Prediction of Enzyme Classification using Protein Sequence Embeddings. In Data Science & Engineering Master of Advanced Study (DSE MAS) Capstone Projects
DataCite Commons. Updated 2026-04-17. Indexed 2025-04-16.
Download link:
https://library.ucsd.edu/dc/object/bb34560073
Resource description:
Biologists work with large numbers of protein sequences, each represented as a string of letters that encodes its amino acids. Because these sequences are text-like, machine learning methods from Natural Language Processing (NLP) can be applied to them to predict enzyme classifications, which are indicative of both protein structure and function. Our goal is to propose a multi-level classification solution that predicts the class of a given enzyme from its protein sequence. Our method uses BERT (Bidirectional Encoder Representations from Transformers) models to create embeddings, or feature vectors, and a variety of machine learning models to predict the class and subclass of an enzyme.
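The two-stage idea in the description (embed the sequence, then predict class and subclass) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the project's implementation: `embed` is a stand-in that returns a 20-dimensional amino acid frequency vector rather than a BERT embedding, the nearest-centroid rule stands in for the unspecified machine learning models, and the sequences and the labels `X`, `X.a`, `Y`, `Y.b` are synthetic placeholders, not real enzymes or EC numbers.

```python
from collections import Counter, defaultdict
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def embed(sequence):
    """Stand-in for a BERT embedding (assumption): amino acid frequencies."""
    counts = Counter(sequence)
    total = max(len(sequence), 1)
    return [counts.get(aa, 0) / total for aa in AMINO_ACIDS]


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest(vector, centroids):
    """Label whose centroid is closest to `vector` in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


class TwoLevelClassifier:
    """Hypothetical helper: predict the class first, then a subclass within it."""

    def fit(self, sequences, labels):
        by_class = defaultdict(list)
        by_subclass = defaultdict(lambda: defaultdict(list))
        for sequence, (cls, sub) in zip(sequences, labels):
            vector = embed(sequence)
            by_class[cls].append(vector)
            by_subclass[cls][sub].append(vector)
        self.class_centroids = {c: centroid(v) for c, v in by_class.items()}
        self.subclass_centroids = {
            c: {s: centroid(v) for s, v in subs.items()}
            for c, subs in by_subclass.items()
        }
        return self

    def predict(self, sequence):
        vector = embed(sequence)
        cls = nearest(vector, self.class_centroids)
        # Only subclasses of the predicted class are considered at level two.
        sub = nearest(vector, self.subclass_centroids[cls])
        return cls, sub


# Synthetic placeholder data (not real enzymes, not real EC numbers).
TRAIN_SEQUENCES = ["AAAAAAGG", "AAAAAAWW", "KKKKKKDD", "KKKKKKEE"]
TRAIN_LABELS = [("X", "X.a"), ("X", "X.b"), ("Y", "Y.a"), ("Y", "Y.b")]

if __name__ == "__main__":
    clf = TwoLevelClassifier().fit(TRAIN_SEQUENCES, TRAIN_LABELS)
    print(clf.predict("AAAAAGGG"))
    print(clf.predict("KKKKKEEE"))
```

Restricting the second stage to subclasses of the predicted class keeps each prediction internally consistent, at the cost that a wrong class at the first stage cannot be recovered at the second. Swapping `embed` for a real protein language model would leave the rest of the structure unchanged.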
Provider:
UC San Diego Library Digital Collections
Created:
2021-12-15



