
Table_1_Development of a TSR-Based Method for Protein 3-D Structural Comparison With Its Applications to Protein Classification and Motif Discovery.DOCX

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://figshare.com/articles/dataset/Table_1_Development_of_a_TSR-Based_Method_for_Protein_3-D_Structural_Comparison_With_Its_Applications_to_Protein_Classification_and_Motif_Discovery_DOCX/13566473
Description:
Developing methods for comparing protein 3-D structures is important for understanding protein function, and it is also very challenging. In the 40 years since the first automated structural comparison method was developed, about 200 papers have been published using different representations of structures. The existing methods can be divided into five categories: sequence-, distance-, secondary structure-, geometry-, and network-based structural comparisons. Each has its own strengths, but also limitations. We have developed a novel method in which the 3-D structure of a protein is modeled using the concept of Triangular Spatial Relationship (TSR), where triangles are constructed with the Cα atoms of a protein as vertices. Every triangle is represented by an integer, which we call a “key.” A key is computed from the length, angle, and vertex labels of the triangle using a rule-based formula, which ensures that identical TSRs are assigned the same key across proteins. A structure is thereby represented by a vector of integers. Our method accurately quantifies the similarity of structures or substructures by counting the identical keys shared by two proteins. The uniqueness of our method includes: (i) a representation of structures that avoids structural superimposition; (ii) the use of triangles to represent substructures, because the triangle is the simplest primitive that captures shape; (iii) comparison of complex structures by matching the integers that correspond to multiple TSRs. Every substructure of one protein is compared to every substructure of the other protein. The method is applied to proteases and kinases because they play essential roles in cell signaling, and a majority of them are drug targets. The new motifs or substructures we identified specifically for proteases and kinases provide deeper insight into their structural relations. 
Furthermore, the method provides a unique way to study protein conformational changes. In addition, the results from the CATH and SCOP data sets clearly demonstrate that our method can distinguish alpha helices from beta pleated sheets. Our method has the potential to be developed into a powerful tool for efficient structure-BLAST search and comparison, just as BLAST is for sequence search and alignment.
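
The key idea in the description above (one integer per Cα triangle, then counting shared integers between two proteins) can be illustrated with a minimal sketch. It is not the authors' rule-based formula, which this record does not give. The choice of descriptors (sorted residue labels, smallest interior angle, longest edge), the bin widths, the mixed-radix packing, and the names `triangle_key`, `protein_keys`, and `similarity` are all assumptions made only for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical settings: the source does not state labels, bin widths, or bin counts.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
LABEL = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
ANGLE_BIN_DEG = 2.0   # smallest interior angle lies in [0, 60] degrees
N_ANGLE_BINS = 30
LENGTH_BIN_A = 1.0    # longest edge, in angstroms, capped at the last bin
N_LENGTH_BINS = 35


def triangle_key(residues, points):
    """Pack one C-alpha triangle into an integer (assumed scheme, not the published one).

    The descriptors do not depend on vertex order, rotation, or translation,
    so identical triangles in different proteins receive the same key.
    """
    labels = sorted(LABEL[r] for r in residues)
    short, mid, longest = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    if mid == 0.0:
        angle = 0.0  # all three points coincide
    else:
        # Law of cosines: the smallest angle is opposite the shortest edge.
        cos_a = (mid**2 + longest**2 - short**2) / (2.0 * mid * longest)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
    angle_bin = min(int(angle / ANGLE_BIN_DEG), N_ANGLE_BINS - 1)
    length_bin = min(int(longest / LENGTH_BIN_A), N_LENGTH_BINS - 1)
    key = 0
    for label in labels:  # mixed-radix packing of the three labels
        key = key * len(AMINO_ACIDS) + label
    return (key * N_ANGLE_BINS + angle_bin) * N_LENGTH_BINS + length_bin


def protein_keys(sequence, coords):
    """Count the keys of every triangle formed by three C-alpha atoms."""
    atoms = list(zip(sequence, coords))
    return Counter(
        triangle_key([r for r, _ in tri], [p for _, p in tri])
        for tri in combinations(atoms, 3)
    )


def similarity(keys_a, keys_b):
    """Shared key count divided by the combined key count (a Jaccard-style score)."""
    union = sum((keys_a | keys_b).values())
    return sum((keys_a & keys_b).values()) / union if union else 0.0
```

Calling `protein_keys` with a one-letter sequence and a matching list of Cα coordinates yields a key multiset, and `similarity` compares two such multisets without any superimposition step. Enumerating all triangles grows cubically with the number of residues, so this sketch is only practical for small inputs.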

Created: 2021-01-13