
Table_10_Development of a TSR-Based Method for Protein 3-D Structural Comparison With Its Applications to Protein Classification and Motif Discovery.XLSX

Indexed from NIAID Data Ecosystem on 2026-03-12
Download link:
https://figshare.com/articles/dataset/Table_10_Development_of_a_TSR-Based_Method_for_Protein_3-D_Structural_Comparison_With_Its_Applications_to_Protein_Classification_and_Motif_Discovery_XLSX/13566476
Description:
Developing methods for comparing protein 3-D structures is important for understanding protein function, and it is also very challenging. In the 40 years since the first automated structural comparison method was developed, ~200 papers have been published using different representations of structures. The existing methods can be divided into five categories: sequence-, distance-, secondary structure-, geometry-, and network-based structural comparisons. Each is distinctive, but each also has limitations. We have developed a novel method in which the 3-D structure of a protein is modeled using the concept of Triangular Spatial Relationship (TSR), where triangles are constructed with the Cα atoms of a protein as vertices. Every triangle is represented by an integer, which we denote as a “key.” A key is computed from the length, angle, and vertex labels using a rule-based formula, which ensures that identical TSRs are assigned the same key across proteins. A structure is thereby represented by a vector of integers. Our method accurately quantifies the similarity of structures or substructures by counting the identical keys shared between two proteins. The uniqueness of our method includes: (i) a unique way to represent structures that avoids structural superimposition; (ii) the use of triangles to represent substructures, as the triangle is the simplest primitive that captures shape; (iii) comparison of complex structures by matching the integers corresponding to multiple TSRs. Every substructure of one protein is compared to every substructure of a different protein. The method is used in studies of proteases and kinases because they play essential roles in cell signaling and a majority of them constitute drug targets. The new motifs or substructures we identified specifically for proteases and kinases provide deeper insight into their structural relations.
Furthermore, the method provides a unique way to study protein conformational changes. In addition, the results from the CATH and SCOP data sets clearly demonstrate that our method can distinguish alpha helices from beta pleated sheets. Our method has the potential to be developed into a powerful tool for efficient structure-BLAST search and comparison, just as BLAST is for sequence search and alignment.
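
The idea of turning each Cα triangle into an integer key and comparing proteins by shared keys can be illustrated with a minimal Python sketch. It is not the published TSR formula, which the description above does not spell out. The function names (`toy_key`, `key_counts`, `shared_keys`), the bin counts, the 64 Å cap, and the choice of the longest edge and largest angle are all assumptions made for illustration, and unlike the real method this toy key does not tie each label to a specific vertex.

```python
import math
from collections import Counter
from itertools import combinations

# All constants below are illustrative assumptions, not values from the paper.
N_LABELS = 20     # one integer label (0..19) per amino-acid type
DIST_BINS = 32    # number of bins for the longest edge
ANGLE_BINS = 16   # number of bins for the largest interior angle
MAX_DIST = 64.0   # longest edges beyond this (in angstroms) share the last bin


def toy_key(labels, points):
    """Map one triangle (3 labels, 3 distinct 3-D points) to an integer."""
    l1, l2, l3 = sorted(labels)
    # Edge lengths sorted ascending, so c is the longest edge.
    a, b, c = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    # Largest interior angle (opposite c) from the law of cosines.
    cos_t = (a * a + b * b - c * c) / (2 * a * b)
    theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    d_bin = min(int(c / MAX_DIST * DIST_BINS), DIST_BINS - 1)
    a_bin = min(int(theta / 180.0 * ANGLE_BINS), ANGLE_BINS - 1)
    # Mixed-radix packing: equal (labels, bins) always give the same integer.
    key = (l1 * N_LABELS + l2) * N_LABELS + l3
    return (key * DIST_BINS + d_bin) * ANGLE_BINS + a_bin


def key_counts(residues):
    """residues: list of (label, (x, y, z)) for the C-alpha atoms."""
    counts = Counter()
    for tri in combinations(residues, 3):
        counts[toy_key([r[0] for r in tri], [r[1] for r in tri])] += 1
    return counts


def shared_keys(counts_a, counts_b):
    """Number of keys two structures have in common (multiset intersection)."""
    return sum((counts_a & counts_b).values())
```

Because the key depends only on labels, edge lengths, and an angle, a structure and a translated copy of it produce the same keys, so no superimposition step is needed before comparing them. Enumerating every triangle grows cubically with the number of residues, so this sketch only suits small inputs.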

Created:
2021-01-13