Table_7_Development of a TSR-Based Method for Protein 3-D Structural Comparison With Its Applications to Protein Classification and Motif Discovery.XLSX
NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://figshare.com/articles/dataset/Table_7_Development_of_a_TSR-Based_Method_for_Protein_3-D_Structural_Comparison_With_Its_Applications_to_Protein_Classification_and_Motif_Discovery_XLSX/13566509
Description:
Development of protein 3-D structural comparison methods is important in understanding protein functions. At the same time, developing such a method is very challenging. In the 40 years since the development of the first automated structural method, ~200 papers have been published using different representations of structures. The existing methods can be divided into five categories: sequence-, distance-, secondary structure-, geometry-, and network-based structural comparisons. Each has its uniqueness, but also limitations. We have developed a novel method where the 3-D structure of a protein is modeled using the concept of Triangular Spatial Relationship (TSR), where triangles are constructed with the Cα atoms of a protein as vertices. Every triangle is represented using an integer, which we denote as “key.” A key is computed using the length, angle, and vertex labels based on a rule-based formula, which ensures assignment of the same key to identical TSRs across proteins. A structure is thereby represented by a vector of integers. Our method is able to accurately quantify similarity of structure or substructure by matching numbers of identical keys between two proteins. The uniqueness of our method includes: (i) a unique way to represent structures to avoid performing structural superimposition; (ii) use of triangles to represent substructures, as the triangle is the simplest primitive to capture shape; (iii) complex structure comparison is achieved by matching integers corresponding to multiple TSRs. Every substructure of one protein is compared to every other substructure in a different protein. The method is used in the studies of proteases and kinases because they play essential roles in cell signaling, and a majority of these constitute drug targets. The new motifs or substructures we identified specifically for proteases and kinases provide a deeper insight into their structural relations.
Furthermore, the method provides a unique way to study protein conformational changes. In addition, the results from CATH and SCOP data sets clearly demonstrate that our method can distinguish alpha helices from beta pleated sheets. Our method has the potential to be developed into a powerful tool for efficient structure-BLAST search and comparison, just as BLAST is for sequence search and alignment.
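The key-based representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formula: the abstract does not give the rule-based formula, so the bin counts, the 70 Å length cap, the choice of longest edge and largest interior angle as the geometric features, the integer packing, and the multiset Jaccard score are all assumptions made for illustration. The function names (`triangle_key`, `protein_keys`, `key_similarity`) are hypothetical as well.

```python
from collections import Counter
from itertools import combinations
import math

# All constants below are illustrative assumptions, not values from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
LABEL = {aa: code for code, aa in enumerate(AMINO_ACIDS)}
N_LABELS = len(AMINO_ACIDS)
N_ANGLE_BINS = 30      # bins over 0..180 degrees
N_LENGTH_BINS = 35     # bins over 0..MAX_LENGTH angstroms
MAX_LENGTH = 70.0


def _bin(value, upper, n_bins):
    """Map a value in [0, upper] to a bin index in [0, n_bins - 1]."""
    index = int(value / upper * n_bins)
    return min(max(index, 0), n_bins - 1)


def triangle_key(residues):
    """Integer key for three (amino_acid, (x, y, z)) residues.

    Uses sorted vertex labels, the longest edge, and the largest interior
    angle, so the key is unchanged by vertex order and by rigid motion.
    """
    labels = sorted(LABEL[aa] for aa, _ in residues)
    points = [xyz for _, xyz in residues]
    a, b, c = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    # Law of cosines: the largest angle lies opposite the longest edge c.
    cos_theta = max(-1.0, min(1.0, (a * a + b * b - c * c) / (2 * a * b)))
    angle = math.degrees(math.acos(cos_theta))

    key = 0
    for code in labels:
        key = key * N_LABELS + code
    key = key * N_ANGLE_BINS + _bin(angle, 180.0, N_ANGLE_BINS)
    key = key * N_LENGTH_BINS + _bin(c, MAX_LENGTH, N_LENGTH_BINS)
    return key


def protein_keys(residues):
    """Multiset of keys over every triangle of C-alpha atoms (O(n^3))."""
    return Counter(triangle_key(tri) for tri in combinations(residues, 3))


def key_similarity(keys_a, keys_b):
    """Multiset Jaccard score: shared key occurrences over all occurrences."""
    union = sum((keys_a | keys_b).values())
    return sum((keys_a & keys_b).values()) / union if union else 0.0
```

Because the keys depend only on labels, distances, and angles, two copies of the same structure in different orientations produce the same multiset of keys, which is why no superimposition step is needed. Enumerating every triangle is cubic in the number of residues, so a practical implementation would need a more economical strategy than this sketch.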
Created:
2021-01-13



