PTML Model of Enzyme Subclasses for Mining the Proteome of Biofuel Producing Microorganisms

Source: acs.figshare.com. Updated 2023-05-30. Indexed 2025-01-22.
Download link:
https://acs.figshare.com/articles/dataset/PTML_Model_of_Enzyme_Subclasses_for_Mining_the_Proteome_of_Biofuel_Producing_Microorganisms/8224058/1
Description:
Predicting enzyme function and enzyme subclasses is a key objective in fields such as biotechnology, biochemistry, medicinal chemistry, and physiology. The Protein Data Bank (PDB) is the largest information archive of biological macromolecular structures, with more than 150,000 entries for proteins, nucleic acids, and complex assemblies. Among these entries, there are more than 4,000 proteins whose functions remain unknown because no detectable homology to proteins of known function has been found. The problem is that our ability to isolate proteins and identify their sequences far exceeds our ability to assign them a defined function. As a result, interest in this topic is growing, and several methods have been developed to identify protein function. In this work, we applied perturbation theory to an original data set of 19,187 enzymes representing all 59 subclasses present in the Protein Data Bank. In addition, we developed a series of artificial neural network models able to predict enzyme–enzyme pairs of query-template sequences with accuracy, specificity, and sensitivity greater than 90% in both training and validation series. As a likely application of this methodology, and to further validate our approach, we used the model to predict a set of enzymes belonging to the yeast Pichia stipitis. This yeast has been widely studied because it is common in nature and produces a high ethanol yield by converting lignocellulosic biomass into bioethanol through the xylose reductase enzyme. On this basis, we tested our model on 222 enzymes, including xylose reductase, the enzyme responsible for the conversion of biomass into bioethanol.
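
The description reports accuracy, specificity, and sensitivity for the pair classifier. As a reminder of how these three figures relate to a confusion matrix, here is a minimal Python sketch. The function name `binary_metrics`, the 0/1 label encoding (1 taken to mean that a query-template pair is predicted to share a subclass), and the toy labels are all assumptions made for illustration. None of it is taken from the data set or from the authors' code.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels.

    Hypothetical helper for illustration; not part of the data set.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = len(pairs)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


# Toy labels (made up), only to exercise the function.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting all three values matters when classes are imbalanced, since accuracy alone can look high while one of the two rates is poor.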

Provider:
ACS Publications