
rbLEC - restricted backbone Local Euler Characteristic - from CATH database

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8382583
Resource description:
--------------------------------------------------------------------------------
Author: Rodrigo A. Moreira (C) 2023
https://orcid.org/0000-0002-7605-8722
LICENSE: CC BY-NC-ND 4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/)
--------------------------------------------------------------------------------
rbLEC - Local Euler Characteristics - from CATH database
--------------------------------------------------------------------------------

A. rbLEC NETWORK
    [I] The network for each PDB[1] structure is defined by taking the PDB atoms N, CA and C of each residue as the nodes of a graph G.
    [II] An edge of G is set if the distance between two atoms in [I] is greater than 2.0 Angstroms.
    [III] The graph G is stored in the files with extension ".network_backboneRE_heavy_gt2".

Equation (1) [6,7]
    \begin{equation}
        \chi = \sum_{k=1}^{N} \kappa_k = \sum_{k=1}^{N} \underbrace{ \left(1 + \sum_{l=1}^{\infty} (-1)^{l} \frac{v_{l-1}}{l+1} \right)_{k}}_{\kappa_k}
    \end{equation}
    Here \chi is the Euler characteristic of G and \kappa_k is the curvature (kappa) of node k.

Equation (2)
    \begin{equation}
        LEC = \sum_{m \in R} \kappa_m = \kappa_{N} + \kappa_{CA} + \kappa_{C}
    \end{equation}
    Here R is the set of backbone nodes (N, CA, C) of one residue.

B. FILENAME EXTENSIONS
  B.1 Basic files
    ".fixed"
        PDB file after applying pdbfixer[2] to structures from the CATH database.
    ".dssp"
        Output of the DSSP[3] software.
    ".stride"
        Output of the STRIDE[4] software.
  B.2 Data files
    ".network_backboneRE_heavy_gt2" - Generated by D.2 below.
        Describes the network graph, as defined in A. above.
    ".knill_curvature" - Generated by D.1 below.
        Contains the filtration of kappas for each vertex of the network.
    ".residues_curvature" - Generated by D.1 below.
        Contains the filtration of LEC, Equation (2) above, for each residue, namely the sum of the 3 kappas from the respective ".knill_curvature" file that correspond to the PDB atoms N, CA and C, as described in A. above.
    ".label" - Generated by D.3 below.
        Extra file for easier assessment of structures. It holds the same LEC information as the respective ".residues_curvature" file, merged with the classes from ".dssp" and ".stride" as well as the residue name and residue ID for each molecule.
        Format of columns:
            cutoff resname resid DSSP_class STRIDE_class LEC

C. FOLDERS
    CATH_FIXED (after uncompressing cath_fixed.tar.xz, approximately 13GB)
        Contains the fixed PDBs and LECs from the CATH[5] database.

D. SOFTWARE
    D.1 lec.py: computes the kappas in Equation (1) above.
        Example usage:
            $ python3 lec.py CATH_FIXED/2x0qA02/2x0qA02
        It creates files with extensions ".kappas" and ".relec", which reproduce, respectively, the files with extensions ".knill_curvature" and ".residues_curvature".
    D.2 pdb2network.lua: creates the rbLEC network file (number of nodes and edge list) from a PDB file, to be used as input by lec.py.
        Example usage:
            $ lua pdb2rbLEC.lua CATH_FIXED/2x0qA02/2x0qA02.fixed
        The output reproduces the file CATH_FIXED/2x0qA02/2x0qA02.pdb.network_backboneRE_heavy_gt2
    D.3 label.lua: creates files with extension '*.label' from the files '*.pdb.stride', '*.pdb.dssp' and '*.pdb.network_backboneRE_heavy_gt2.residues_curvature'.
        Example usage:
            $ lua label.lua CATH_FIXED/2x0qA02/2x0qA02.pdb
        The output reproduces the file CATH_FIXED/2x0qA02/2x0qA02.pdb.network_backboneRE_heavy_gt2.residues_curvature.label

REFERENCES
[1] Berman, H., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T., Weissig, H., Shindyalov, I., & Bourne, P. (2000). The protein data bank. Nucleic acids research, 28, 235–42.
[2] Eastman, P., Swails, J., Chodera, J., McGibbon, R., Zhao, Y., Beauchamp, K., Wang, L.P., Simmonett, A., Harrigan, M., Stern, C., & others (2017). OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLoS computational biology, 13(7), e1005659.
[3] Kabsch, W., & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers: Original Research on Biomolecules, 22(12), 2577–2637.
[4] Frishman, D., & Argos, P. (1995). Knowledge-based protein secondary structure assignment. Proteins: Structure, Function, and Bioinformatics, 23(4), 566–579.
[5] Knudsen, M., & Wiuf, C. (2010). The CATH database. Human genomics, 4(3), 1–6.
[6] Levitt, N. (1992). The Euler characteristic is the unique locally determined numerical homotopy invariant of finite complexes. Discrete & computational geometry, 7, 59–67.
[7] Knill, O. (2011). A graph theoretical Gauss-Bonnet-Chern theorem. arXiv preprint arXiv:1111.5395.
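
ILLUSTRATIVE SKETCH (not part of the distributed software)
    As an illustration of Equations (1) and (2), the following is a minimal Python sketch. It is not the author's lec.py and makes no claim to reproduce its output files. It assumes, following Knill [7], that v_{l-1} at node k counts the complete subgraphs K_l among the neighbours of k, so that every clique of size s containing k contributes (-1)^(s-1)/s to kappa_k. The function names, the edge-list input and the assumption that nodes are ordered N, CA, C residue by residue are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def knill_curvatures(n_nodes, edges):
    """Return kappa_k for every node of an undirected graph (Equation (1)).

    Assumption: each clique of size s that contains node k adds
    (-1)**(s - 1) / s to kappa_k; the node itself (s = 1) adds 1.
    """
    adjacency = [set() for _ in range(n_nodes)]
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # The leading "1" of Equation (1): every node starts with kappa = 1.
    kappa = [Fraction(1)] * n_nodes

    def extend(clique, candidates):
        # Credit the current clique (size >= 2) to each of its members.
        size = len(clique)
        if size >= 2:
            share = Fraction((-1) ** (size - 1), size)
            for node in clique:
                kappa[node] += share
        # Grow the clique only with higher-numbered common neighbours,
        # so every clique is enumerated exactly once.
        for node in sorted(candidates):
            narrowed = {u for u in candidates & adjacency[node] if u > node}
            extend(clique + [node], narrowed)

    for node in range(n_nodes):
        extend([node], {u for u in adjacency[node] if u > node})
    return kappa


def residue_lec(kappa, nodes_per_residue=3):
    """Sum kappas per residue (Equation (2)).

    Assumption (hypothetical layout): nodes are stored residue by residue
    in the order N, CA, C, so each consecutive triple is one residue.
    """
    if len(kappa) % nodes_per_residue:
        raise ValueError("node count is not a multiple of nodes_per_residue")
    return [
        sum(kappa[i:i + nodes_per_residue], Fraction(0))
        for i in range(0, len(kappa), nodes_per_residue)
    ]
```

    For a single residue whose three nodes are mutually connected (a triangle), each kappa is 1 - 2/2 + 1/3 = 1/3, so the LEC of that residue and the Euler characteristic of the graph are both 1. Exact fractions are used here to avoid rounding when the alternating terms cancel; plain clique enumeration grows quickly with graph density, so this sketch is only suitable for small examples.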
Created: 2023-09-27