Dataset of the study "Insights into an original pocket-ligand pair classification: a promising tool for ligand profile prediction"
Source: DataCite Commons. Updated: 2025-04-01. Indexed: 2024-07-25.
Download link:
https://figshare.com/articles/dataset/Dataset_of_the_study_Insights_into_an_original_pocket_ligand_pair_classification_a_promising_tool_for_ligand_profile_prediction_/662781/1
Resource description:
This page hosts two files:
 * DataDescription.csv
 * descriptor_values.csv
These two files contain the dataset of the study entitled "Insights into an original pocket-ligand pair classification: a promising tool for ligand profile prediction", written by S. Pérot, L. Regad, C. Reynès, O. Sperandio, M.A. Miteva, B.O. Villoutreix and A.C. Camproux (affiliation: Univ. Paris Diderot, Sorbonne Paris Cité, INSERM UMRS 973, MTi, F-75205 Paris, France).

Abstract of the study:
This study presents a multivariate approach relating ligand properties to protein pocket properties, based on the analysis of known ligand-protein interactions. We explore and optimize the pocket-ligand pair space by combining pocket and ligand descriptors using Principal Component Analysis and by developing a classification engine on this paired space, revealing five main clusters of pocket-ligand pairs sharing specific and similar structural or physicochemical properties. These pocket-ligand pair clusters highlight correspondences between pocket and ligand topological and physicochemical properties and capture relevant information with respect to protein-ligand interactions.

Dataset description:
File: DataDescription.csv
This study is based on a dataset of 483 pocket-ligand pairs. To create a large training set of pocket-ligand pairs, we initially gathered the refined set from the PDBbind database (Wang et al., 2004; 2005) and the Astex test set (Hartshorn et al., 2007) and selected complexes that contained drug-like ligands (i.e. small chemical compounds). Two protein-ligand complex datasets were compiled for the training set. The first one is composed of 560 non-redundant protein-ligand structures with a resolution better than 2.5 Å, retrieved from the refined set of the PDBbind database.
From these structures, we removed those containing metal ions or cofactors next to the co-crystallized ligand, resulting in a selection of 432 structures. The second one is composed of 85 manually curated protein-ligand complexes from the Astex test set. As for the previous set, we removed the structures with ions or cofactors next to the ligand, resulting in a selection of 51 structures. The final dataset therefore contains 483 protein-ligand structures (432 + 51). The ID and information about each pair are available in the table DataDescription.csv, which contains:
 * PDB code of the complex
 * Protein chains contained in the PDB file. If the complex contains several protein chains, their chain IDs are separated by "/"
 * Amino acid (AA) sequence of each chain. AA sequences correspond to the sequences of the crystallized proteins. If the complex contains several protein chains, their AA sequences are separated by "/"
 * UniProt ID of each chain. If the complex contains several protein chains, their UniProt IDs are separated by "/". NoId means that no UniProt ID was found for a given chain
 * SMILES code of the ligand of interest
 * PDB code of the ligand of interest
 * Cluster assigned to the pocket-ligand pair

File: descriptor_values.csv
This file is a table containing the 24 pocket and ligand descriptors and the cluster assignment of each pocket-ligand pair (last column).

Pocket descriptors:
Based on the current literature, we developed tools/scripts or used available packages to compute the following standard pocket descriptors on the binding cavities.
 * pocket_volume : volume of the pocket, estimated using Chimera software (Sanner et al., 1996)
 * protomol_polarity_ratio : polarity ratio of the pocket (Eyrisch and Helms, 2007). It ranges from 0 (not polar) to 1 (polar).
 * pocket_rugosity : pocket rugosity (Pettit and Bowie, 1999). Rugosity measures how rough a pocket is: a high value indicates a rough pocket.
 * pocket_planarity : pocket planarity (Sugaya and Ikeda, 2009). The planarity ranges from 0 (concave) to 1 (flat).
 * pocket_narrowness : pocket narrowness (Sugaya and Ikeda, 2009). The narrowness ranges from 0 (full circle) to 1 (line).
 * pocket_lambda0, pocket_lambda2 : first and third moments of inertia of the pocket. The three moments of inertia correspond to the eigenvalues of the inertia matrix computed on the pocket. The moment of inertia of a virtual pocket with respect to a given axis describes how many probes the pocket has overall and how far each probe is from the axis. Consequently, the closer the moments of inertia are to one another, the more spherical the pocket is. Conversely, the more lambda0 differs from lambda2, the more cylindrical the pocket tends to be.
 * pocket_hbond_acceptor : number of hydrogen-bond acceptors of the pocket (Schalon et al., 2008)
 * pocket_hba.pour : % of hydrogen-bond acceptors of the pocket (Schalon et al., 2008)
 * pocket_hbond_donor : number of hydrogen-bond donors of the pocket (Schalon et al., 2008)
 * pocket_hbd.pour : % of hydrogen-bond donors of the pocket (Schalon et al., 2008)
 * pocket_charge : pocket charge, computed as the difference between the number of positively charged amino acids and the number of negatively charged amino acids

Ligand descriptors:
The ligand descriptors that are also defined for pockets were computed as described in the pocket descriptors section, while the remaining ones were computed using the software FAF-Drugs2 (Lagorce et al., 2008).
 * ligand_volume : ligand volume
 * ligand_polarity_ratio : ligand polarity
 * RotatableB : number of rotatable bonds of the ligand
 * rot.pour : % of rotatable bonds of the ligand
 * LogP : LogP of the ligand
 * HBA : number of hydrogen-bond acceptors of the ligand
 * HBD : number of hydrogen-bond donors of the ligand
 * hbd.pour : % of hydrogen-bond donors of the ligand
 * PSA : polar surface area of the ligand
 * Charge : ligand charge
 * ligand_lambda0, ligand_lambda2 : first and third moments of inertia of the ligand
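To make the lambda0/lambda2 descriptors more concrete, the following is a minimal sketch of how three moments of inertia can be obtained as eigenvalues of an inertia matrix built from a set of 3D points (pocket probes or ligand atoms). It is not the authors' implementation. Several choices here are assumptions for illustration: every point has unit mass, the matrix is taken about the centroid, the eigenvalues are returned in ascending order, and the function names (`inertia_tensor`, `symmetric_eigenvalues_3x3`, `moments_of_inertia`) are hypothetical. The eigenvalues are computed with the standard closed-form trigonometric solution for symmetric 3x3 matrices so that the example needs only the Python standard library.

```python
import math


def inertia_tensor(points):
    """Inertia matrix of unit-mass points about their centroid (assumption: unit mass)."""
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    ixx = iyy = izz = ixy = ixz = iyz = 0.0
    for x, y, z in points:
        x, y, z = x - cx, y - cy, z - cz
        ixx += y * y + z * z
        iyy += x * x + z * z
        izz += x * x + y * y
        ixy -= x * y
        ixz -= x * z
        iyz -= y * z
    return [[ixx, ixy, ixz], [ixy, iyy, iyz], [ixz, iyz, izz]]


def symmetric_eigenvalues_3x3(a):
    """Eigenvalues of a symmetric 3x3 matrix in ascending order (trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:
        # Matrix is already diagonal: the eigenvalues are the diagonal entries.
        return tuple(sorted((a[0][0], a[1][1], a[2][2])))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # b = (a - q*I) / p has eigenvalues in [-2, 2] and det(b)/2 in [-1, 1].
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest  # the trace equals the sum of eigenvalues
    return (smallest, middle, largest)


def moments_of_inertia(points):
    """Return three moments of inertia (ascending order is an assumption of this sketch)."""
    return symmetric_eigenvalues_3x3(inertia_tensor(points))


if __name__ == "__main__":
    # Toy coordinates, not taken from the dataset.
    compact = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    elongated = [(-1, -1, 0), (0, 0, 0), (1, 1, 0)]
    print("compact:", moments_of_inertia(compact))
    print("elongated:", moments_of_inertia(elongated))
```

In this sketch the compact, symmetric point set yields three equal moments, whereas the elongated one yields a smallest moment far from the largest. This matches the interpretation given above: close moments suggest a spherical shape, and a large gap between the first and third moments suggests a cylindrical one.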
Provider:
figshare
Created:
2016-01-11



