Training and testing sample data used in "Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction"
Source: DataCite Commons. Updated: 2020-09-01. Indexed: 2024-07-25.
Download link:
https://springernature.figshare.com/articles/dataset/Training_and_testing_sample_data_used_in_Exploring_the_potential_of_3D_Zernike_descriptors_and_SVM_for_protein-protein_interface_prediction_/5354293
Description:
This dataset contains the generated training and testing samples used in the paper "Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction".

3D Zernike descriptors represent the protein surface shape. A Support Vector Machine (SVM) is the binary classifier: it is trained on the training data and then predicts the class labels of the test data given only the test feature vectors.

Data are archived in .tar.xz format, which can be extracted by common archive utilities. Each archive contains a number of text files in the SVMlight format. Each line records 1331 colon-separated pairs of numbers (the first being a feature index, an integer from 1 to 1331; the second a floating-point value). .csv files are also provided, recording the precision, recall and accuracy values for the test predictions.

The first one or two letters of the archive filename, together with the letter "l" or "r", identify the protein class. The keyword "train" identifies training sets and the keyword "test" identifies testing sets. Training sets are available as either balanced or unbalanced. See the associated paper for further details.

Background: In the related paper we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as the classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein-Protein Docking Benchmark 5.0.
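
The file layout described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' tooling: it assumes each line starts with a class label (as the SVMlight format generally does) followed by `index:value` pairs, and it assumes a label of +1 marks the positive (interface) class, which the description does not state. The function names `parse_svmlight_line` and `precision_recall_accuracy` are hypothetical, and the metric definitions are the standard ones, which may or may not match exactly how the provided .csv values were computed.

```python
from typing import Dict, Sequence, Tuple

N_FEATURES = 1331  # feature indices range from 1 to 1331 per the description


def parse_svmlight_line(line: str) -> Tuple[int, Dict[int, float]]:
    """Parse one SVMlight-style line into (label, {feature_index: value})."""
    # Drop an optional trailing comment introduced by '#'.
    body = line.split("#", 1)[0].strip()
    tokens = body.split()
    if not tokens:
        raise ValueError("empty line")
    # Assumption: the first token is the class label (e.g. +1 or -1).
    label = int(float(tokens[0]))
    features: Dict[int, float] = {}
    for token in tokens[1:]:
        index_text, value_text = token.split(":", 1)
        index = int(index_text)
        if not 1 <= index <= N_FEATURES:
            raise ValueError(f"feature index out of range: {index}")
        features[index] = float(value_text)
    return label, features


def precision_recall_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Standard binary-classification metrics; `positive` is an assumed label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy
```

For whole files, an established reader such as scikit-learn's `load_svmlight_file` is likely a better choice than a hand-written parser; the sketch is only meant to make the line format concrete.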
Provider:
figshare
Created:
2017-09-01



