
Data for "Machine Learning Scoring Functions for Drug Discovery from Experimental and Computer-generated Protein-Ligand Structures: Towards Per-target Scoring Functions"

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7383353
Description:
Data used in "Machine Learning Scoring Functions for Drug Discovery from Experimental and Computer-generated Protein-Ligand Structures: Towards Per-target Scoring Functions" by F. Pellicani, D. Dal Ben, A. Perali, S. Pilati.

If you use these data or the Python script for your research or other activities, please cite the corresponding journal article.

====================
Uncompressing the zipped file DataSFUnicam.zip provides the following files and folders:

DataSFUnicam/

    ExperimentalDataPDBFiles/
        This folder contains 2408 .pdb files of experimental complex structures. Each file is named with a unique code identifying the protein-ligand complex.

    ExperimentalDataXLSXFile.xlsx
        This Excel file reports the chemical information for the experimental protein-ligand complexes. In the sheet named “Foglio1”, the first column contains the unique code of the protein-ligand complex and the second column contains the experimentally measured pK_d.

    SyntheticDataPDBFiles/
        This folder contains the .pdb files of the synthetic complex structures. The .pdb files are grouped into 17 folders, one per target protein, and each folder is named after its protein. Each folder contains the .pdb file for the best position of each protein-ligand pair according to the MOE docking score. Each file is named with a unique code.

    SyntheticDataXLSXFiles/
        This folder contains 17 Excel files with the chemical information of the synthetic protein-ligand complexes. Each file is named after the corresponding target protein. In the sheet named “Foglio1” of each .xlsx file, the first column contains the unique code of the protein-ligand complex in each conformation, the second column contains an auxiliary numerical code identifying the protein-ligand pair, the third column contains the experimentally measured pK_i, and the fourth column contains the docking score provided by the MOE software.
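The experimental files described above could be paired up along the following lines. This is a minimal sketch and not part of the original package: it assumes that each structure is stored as "<code>.pdb", that the sheet has no header row, and that pandas with an Excel engine such as openpyxl is installed. The helper names load_affinities and match_structures are hypothetical.

```python
from pathlib import Path


def load_affinities(xlsx_path, sheet="Foglio1"):
    """Return {complex_code: pK_d} read from the first two columns of the sheet.

    Assumption: the sheet has no header row; use header=0 if it has one.
    """
    import pandas as pd  # imported lazily; requires an Excel engine such as openpyxl

    table = pd.read_excel(xlsx_path, sheet_name=sheet, header=None, usecols=[0, 1])
    codes = table.iloc[:, 0]
    values = table.iloc[:, 1]
    return {str(code): float(value) for code, value in zip(codes, values)}


def match_structures(affinities, pdb_dir):
    """Pair each complex code with its .pdb file.

    Returns (matched, missing): matched maps code -> (path, affinity),
    missing lists the codes with no file. Assumption: files are "<code>.pdb".
    """
    pdb_dir = Path(pdb_dir)
    matched, missing = {}, []
    for code, value in affinities.items():
        path = pdb_dir / f"{code}.pdb"
        if path.is_file():
            matched[code] = (path, value)
        else:
            missing.append(code)
    return matched, missing


# Example usage, once DataSFUnicam.zip has been uncompressed:
# affinities = load_affinities("DataSFUnicam/ExperimentalDataXLSXFile.xlsx")
# matched, missing = match_structures(affinities, "DataSFUnicam/ExperimentalDataPDBFiles")
```

Returning the unmatched codes separately makes it easy to spot a naming mismatch between the spreadsheet and the folder before any training starts.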
====================
USER GUIDE FOR THE PYTHON SCRIPT

Download and uncompress the zipped file "SFUnicam.zip" with a command like "unzip SFUnicam.zip". The following file structure is created:

SFUnicam/
    ComplexToBePredictedFolder/4ey5_30.pdb
    MaxAssMatrix.npy
    my_model
    devStndSynt.npy
    mediaSynt.npy
    UnicamSF13prot.py
    README.txt

The subfolder "ComplexToBePredictedFolder/" contains the example PDB file "4ey5_30.pdb".

-) To execute the script "UnicamSF13prot.py", Python 3 should be installed with the following libraries and sublibraries:

    Keras:
        Regularizers
        Sequential (keras.models)
        Conv1D, Dense, MaxPooling1D, GlobalMaxPooling1D, GlobalAveragePooling1D, AveragePooling1D (keras.layers)
        Adam (keras.optimizers)
    NumPy
    TensorFlow

Operation:

-) Copy the .pdb file of the protein-ligand complex whose affinity is to be predicted into the subfolder “ComplexToBePredictedFolder/”.
-) Make sure the following files are in the same folder as the Python script:
    MaxAssMatrix.npy
    mediaSynt.npy
    devStndSynt.npy
    my_model
-) Run the code using Python 3 with a command like "python3.x UnicamSF13prot.py".
-) When prompted, enter the name of the protein-ligand PDB file whose affinity is to be predicted (excluding the extension ".pdb").
-) Read the predicted affinity from the screen.
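The file requirements in the operation steps above can be checked before launching the script. The following is a minimal sketch and not part of the original package: the function name missing_inputs is hypothetical, and since the guide does not say whether "my_model" is a file or a folder, only its existence is tested.

```python
from pathlib import Path

# File and folder names taken from the user guide above.
REQUIRED_FILES = ("MaxAssMatrix.npy", "mediaSynt.npy", "devStndSynt.npy", "my_model")
INPUT_FOLDER = "ComplexToBePredictedFolder"


def missing_inputs(script_dir, complex_name):
    """Return the paths required by the user guide that do not exist.

    script_dir is the folder holding UnicamSF13prot.py; complex_name is the
    PDB file name without the ".pdb" extension, as the script expects it.
    """
    script_dir = Path(script_dir)
    expected = [script_dir / name for name in REQUIRED_FILES]
    expected.append(script_dir / INPUT_FOLDER / f"{complex_name}.pdb")
    # "my_model" may be a file or a folder, so only existence is checked.
    return [str(path) for path in expected if not path.exists()]


# Example usage with the bundled example complex:
# problems = missing_inputs("SFUnicam", "4ey5_30")
# if problems:
#     print("Missing:", *problems, sep="\n  ")
```

An empty list means every input named in the guide is in place and the script can be run.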
Created:
2023-01-09