Benchmark Dataset for Structure Refinement Methods of Protein Complex Models
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/5026935
Description:
This is the dataset used in our work entitled "Benchmarking of Structure Refinement Methods for Protein Complex Model" by Jacob Verburgt and Daisuke Kihara, which is under review.
ZDOCK Derived Benchmark Dataset:
The primary benchmark set used in this work was derived directly from the ZDOCK Benchmark set. The ZDOCK set contains four structures per target: an unbound ligand, an unbound receptor, a bound ligand, and a bound receptor. In the ZDOCK set, the coordinates of the bound subunits are oriented identically to their complex structure, and the unbound subunits are superimposed onto their respective bound subunits. Our dataset creates the optimally oriented "unbound" complexes by combining the superimposed unbound subunits and removing waters, ligands, and other non-protein atoms. These unbound complexes are saved in the dataset in the form "XXXX_c_u.pdb", where XXXX is the PDB ID.
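The assembly step described above can be illustrated with a minimal Python sketch. This is not the authors' script. It assumes the superimposed unbound subunits are stored as "XXXX_r_u.pdb" (receptor) and "XXXX_l_u.pdb" (ligand), and it treats every non-ATOM record as non-protein, which would also discard modified residues stored as HETATM. The function names are hypothetical.

```python
from pathlib import Path


def protein_atom_lines(pdb_text):
    """Keep only ATOM records; HETATM records (waters, ligands) and headers are dropped."""
    return [line for line in pdb_text.splitlines() if line.startswith("ATOM")]


def build_unbound_complex(receptor_text, ligand_text):
    """Concatenate the protein atoms of the receptor and ligand into one PDB string."""
    lines = protein_atom_lines(receptor_text)
    lines.append("TER")
    lines += protein_atom_lines(ligand_text)
    lines.append("TER")
    lines.append("END")
    return "\n".join(lines) + "\n"


def write_unbound_complex(pdb_id, directory):
    """Read the two unbound subunit files (assumed names) and write XXXX_c_u.pdb."""
    directory = Path(directory)
    receptor = (directory / f"{pdb_id}_r_u.pdb").read_text()
    ligand = (directory / f"{pdb_id}_l_u.pdb").read_text()
    out_path = directory / f"{pdb_id}_c_u.pdb"
    out_path.write_text(build_unbound_complex(receptor, ligand))
    return out_path
```

No coordinates are transformed here, because the unbound subunits are already superimposed onto the bound complex frame.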
From the complete ZDOCK Benchmark of 230 targets, 18 targets were removed because they contain multiple ligand chains, which is incompatible with the standard ligand-to-receptor model used within CAPRI. The ZDOCK PDB IDs of these targets are "1AKJ", "1BJ1", "1DE4", "1EER", "1EXB", "1EZU", "1GP2", "1I9R", "1JMO", "1K74", "1N2C", "1QFW", "2HMI", "3EO1", "3HMX", "4FQI", "4GXU", and "9QFW".
For an additional 8 targets, superimposing the ligand and receptor structures onto the complex led to entanglement of the chains, and these targets were also removed from the dataset. The ZDOCK PDB IDs for these targets are "1BGX", "1H1V", "1IRA", "1R8S", "1Y64", "2OT3", "3AAD", and "4GAM".
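The two exclusion lists can be applied to a list of ZDOCK target IDs with a short Python sketch. The set and function names are hypothetical, the IDs are copied as listed in this description, and "1AHW" in the usage line is only an illustrative ID.

```python
# Targets removed for containing multiple ligand chains (IDs as listed above).
MULTI_LIGAND_CHAIN = {
    "1AKJ", "1BJ1", "1DE4", "1EER", "1EXB", "1EZU", "1GP2", "1I9R", "1JMO",
    "1K74", "1N2C", "1QFW", "2HMI", "3EO1", "3HMX", "4FQI", "4GXU", "9QFW",
}

# Targets removed because superimposition led to entangled chains.
ENTANGLED = {"1BGX", "1H1V", "1IRA", "1R8S", "1Y64", "2OT3", "3AAD", "4GAM"}

EXCLUDED = MULTI_LIGAND_CHAIN | ENTANGLED


def retained_targets(all_ids):
    """Return the IDs that survive both filters, preserving input order."""
    return [pdb_id for pdb_id in all_ids if pdb_id.upper() not in EXCLUDED]


# Illustrative usage with a made-up subset of target IDs.
print(retained_targets(["1AHW", "1BGX", "1akj"]))
```

Applied to the full 230-target list, this would leave 230 - 18 - 8 = 204 targets, assuming every listed ID appears in the benchmark.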
Note:
In this work, we also used a CAPRI scoring model dataset derived from CAPRI rounds 38-45. CAPRI guidelines do not allow us to distribute this dataset directly, but it can be derived from the "Scoring round" models on the CAPRI website.
The targets considered were T122-T125, T131-T133, and T136, as these targets contained globular protein ligands and receptors. Please contact us directly if you have any further questions about this dataset.
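For scripting against the CAPRI scoring models, the target ranges above can be expanded into explicit target names. The following Python sketch is only a convenience. The `expand_targets` helper is hypothetical and assumes the ranges are inclusive.

```python
def expand_targets(spec):
    """Expand a spec such as 'T122-T125, T136' into a list of target names."""
    targets = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            first, last = int(start.lstrip("T")), int(end.lstrip("T"))
            # Ranges are treated as inclusive on both ends (assumption).
            targets.extend(f"T{n}" for n in range(first, last + 1))
        else:
            targets.append(part)
    return targets


CAPRI_TARGETS = expand_targets("T122-T125, T131-T133, T136")
print(CAPRI_TARGETS)
```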
Created:
2021-06-25



