
Project files provided as supporting information to the manuscript "Information-theoretical measures identify accurate low-resolution representations of protein configurational space"

Indexed by NIAID Data Ecosystem on 2026-03-13
Download link:
https://zenodo.org/record/6554497
Description:
The dataset contains the following compressed folder:

-Notebooks.zip: This folder contains:
  -python_script:
        -RESREL.py: script performing the clustering and computing the relevance-resolution curves
        -random_curves.py: script generating the random values and computing the corresponding RES-REL curves
        -Cluster_distance_matrix.py: script returning the distances among clusters for a given partition
  -python_notebook:
        -Exploratory_analysis.ipynb: analysis performed on the 12-protein_dataset
        -DMAPS_ANTI.ipynb: Diffusion Map for the antibody
        -DMAPS_COV_1ake.ipynb: Diffusion Map + inter/intra-state decomposition of the covariance for 1ake

Packages required for the usage of these python scripts/notebooks:
  -numpy
  -pandas
  -matplotlib
  -seaborn
  -multiprocessing
  -scipy

======== RAW DATA ========

The raw data produced and employed in this study are available on a Google Drive folder at the following address:
https://drive.google.com/drive/folders/1PasAUCgpR5-gdzUVEdyusgZIayQN0Le9

In this folder, together with the compressed Notebooks.zip folder, one can find the compressed folder Data.zip, which contains the following data:

-12-protein_dataset:
    -md.mdp: the .mdp file used in the MD simulations
    -PROTEIN_PDB_CODE:
        -Hk_{sel}.npy & Hs_{sel}.npy: the Relevance (Hk) and Resolution (Hs) curves, sel=[all, CA, CB]
        -RMSD_{sel}.npy: the RMSD matrix, sel=[all, CA, CB]
        -npt.gro: protein+water+ions structure at the end of the equilibration (NVT+NPT)
        -MSR_df.csv: a dataset containing the following columns:
                  'area': area under the Relevance-Resolution curve;
                  'selection': the atomic selection (['all', 'CA', 'CB']) used to compute the RMSD matrix on which the clustering (and consequently the Relevance-Resolution curves) is based;
                  'method': the linkage measure used in the clustering procedure, an integer in [0,6];
                  'method_name': the linkage measure used in the clustering procedure, a string in ['average','ward','complete','single','centroid','median','weighted'];
                  'rmsd_mean': the mean value of the RMSD vector along the trajectory, computed with respect to the first frame;
                  'rmsd_var': the variance of the RMSD vector along the trajectory, computed with respect to the first frame;
                  'rgy_mean': the mean value of the radius of gyration along the trajectory;
                  'rgy_var': the variance of the radius of gyration along the trajectory;
                  'rmsf_mean': the mean value of the RMSF;
                  'rmsf_var': the variance of the RMSF;
                  'RMSD_M_mean': the mean value of the RMSD matrix;
                  'RMSD_M_var': the variance of the RMSD matrix.
-Random:
    -curves.npy: 100K random Relevance-Resolution curves for M=40001
    -curves_s.npy: 100K random Relevance-Resolution curves for M=15000
-validation_dataset:
    -antibody:
        -Hk_CB.npy & Hs_CB.npy: the Relevance (Hk) and Resolution (Hs) curves
        -RMSD_CB.npy: the RMSD matrix
        -DIFF_{M}.npy: the eigenvalues/eigenvectors of the 10-D diffusion space
        -Label_{method}.npy: the label vector for n_clusters
    -1ake:
        -Hk_{sel}.npy & Hs_{sel}.npy: the Relevance (Hk) and Resolution (Hs) curves
        -RMSD_{sel}.npy: the RMSD matrix
        -DIFF_{M}.npy: the eigenvalues/eigenvectors of the 10-D diffusion space
        -Label_{method}.npy: the label vector for n_clusters
        -intra_{m}.npy: the intra-cluster covariance matrix
        -inter_cov_{m}.npy: the inter-cluster correlation matrix

NOTE =====

The matrices of the cluster distances for adenylate kinase (1ake) and the antibody were computed with the script Cluster_distance_matrix.py. These matrices are not included in the dataset because of their large size; the raw data are, however, available upon request.
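To make the relation between a clustering, the Resolution (Hs) and Relevance (Hk) curves, and the 'area' column more concrete, here is a minimal sketch in Python. It is not the RESREL.py script: it assumes the standard resolution-relevance definitions (Hs is the entropy of the cluster populations, Hk is the entropy of the distribution of cluster sizes, both divided by log M), uses SciPy hierarchical clustering with 'average' linkage, and runs on synthetic points instead of a real RMSD matrix. All function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cut_tree, linkage
from scipy.spatial.distance import pdist


def resolution_relevance(labels):
    """Return (H_s, H_k), both normalised by log(M), for one partition."""
    labels = np.asarray(labels)
    M = labels.size
    # k_s: number of frames in each cluster s
    _, sizes = np.unique(labels, return_counts=True)
    p_s = sizes / M
    H_s = -np.sum(p_s * np.log(p_s)) / np.log(M)
    # m_k: number of clusters that contain exactly k frames
    k, m_k = np.unique(sizes, return_counts=True)
    p_k = k * m_k / M
    H_k = -np.sum(p_k * np.log(p_k)) / np.log(M)
    return float(H_s), float(H_k)


def curves_from_condensed(condensed, method="average"):
    """Cut the dendrogram at every level; index n-1 holds the n-cluster partition."""
    Z = linkage(condensed, method=method)
    # Full cut tree: column 0 = all singletons, last column = one cluster.
    cuts = cut_tree(Z)[:, ::-1]
    pairs = [resolution_relevance(cuts[:, j]) for j in range(cuts.shape[1])]
    Hs, Hk = (np.array(v) for v in zip(*pairs))
    return Hs, Hk


def area_under_curve(Hs, Hk):
    """Trapezoidal area of Hk as a function of Hs (assumed meaning of 'area')."""
    order = np.argsort(Hs)
    x, y = Hs[order], Hk[order]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# Synthetic stand-in for a trajectory: 60 "frames" in a 3-D feature space.
rng = np.random.default_rng(0)
points = rng.normal(size=(60, 3))
condensed = pdist(points)  # condensed pairwise-distance vector
Hs, Hk = curves_from_condensed(condensed)
area = area_under_curve(Hs, Hk)
print(f"area under the Relevance-Resolution curve: {area:.3f}")
```

With a square matrix such as the one stored in RMSD_{sel}.npy, `scipy.spatial.distance.squareform(matrix, checks=False)` would give the condensed vector expected by `linkage`. Building the full cut tree needs memory quadratic in the number of frames, so this approach is only meant for small illustrative inputs, not for trajectories of the size used in the study.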
Created:
2022-05-17