Spatio-temporal learning from MD simulations for protein-ligand binding affinity prediction
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10390549
Resource description:
This Zenodo repository provides the resources accompanying the paper "Spatio-temporal learning from MD simulations for protein-ligand binding affinity prediction". We created a dataset of 63,000 molecular dynamics (MD) simulations by running 10 simulations of 10 ns each on 6,300 complexes. Neural networks were developed to learn from this data and predict the binding affinities of protein-ligand complexes. The implementation of these neural networks is available on GitHub. The collection includes training/benchmark datasets, trained statistical models, and results on test sets (CSV and PDF files).
Training/benchmark datasets:
Training, validation and test sets are provided to train and evaluate the following neural networks:
Pafnucy, Proli and Densenucy without MD data augmentation (dataset file names contain "initial")
Pafnucy, Proli and Densenucy with MD data augmentation (dataset file names contain "MDDA")
Pafnucy with/without MD data augmentation, and Proli and Densenucy with MD data augmentation, were also evaluated on the FEP test set (test set file names contain "fep")
Timenucy and Videonucy using spatiotemporal learning methods (dataset file names contain "4D")
Pafnucy without MD data augmentation and on a reduced training set (dataset file names contain "reduced")
For each training methodology (MD data augmentation and spatiotemporal learning), we provide the data for the whole complex, for the ligand only, and for the protein only. Additionally, for spatiotemporal learning, we provide the ligand-only data generated using the tracking mode.
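The file name tags listed above can be used to sort a downloaded copy of the datasets. The following is a minimal sketch, assuming the tags are matched case-sensitively exactly as quoted; the helper names (`tags_in`, `group_by_tag`) and the example file names are hypothetical and are not taken from the repository.

```python
from collections import defaultdict

# Tags quoted in the dataset description and what each one denotes.
TAG_MEANING = {
    "initial": "without MD data augmentation",
    "MDDA": "with MD data augmentation",
    "fep": "FEP test set",
    "4D": "spatiotemporal learning (Timenucy, Videonucy)",
    "reduced": "reduced training set (Pafnucy, no MD data augmentation)",
}


def tags_in(filename):
    """Return every known tag that occurs in the file name (case-sensitive)."""
    return [tag for tag in TAG_MEANING if tag in filename]


def group_by_tag(filenames):
    """Group file names by tag; a name carrying several tags appears in each group."""
    groups = defaultdict(list)
    for name in filenames:
        for tag in tags_in(name) or ["untagged"]:
            groups[tag].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
example_files = [
    "training_set_initial.hdf",
    "test_set_fep_MDDA.hdf",
    "validation_set_4D.hdf",
    "README.txt",
]

groups = group_by_tag(example_files)
for tag, names in groups.items():
    print(tag, "->", TAG_MEANING.get(tag, "no known tag"), names)
```

A file name may carry more than one tag, so the sketch returns all matches instead of picking a single category.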
Statistical models:
We provide the models trained with Pafnucy, Proli, Densenucy, Timenucy and Videonucy. Each model was trained in 10 replicates.
For Pafnucy, Proli, Densenucy, we provide the models trained with random and systematic rotations, as well as with or without MD data augmentation.
For Proli, Densenucy, Timenucy and Videonucy, we provide the models trained on the whole complex, only the ligand or only the protein.
For Pafnucy, we also provide the models trained on the reduced set (5,932 complexes).
Results on test sets (CSV & PDF files):
We provide the predictions on the PDBbind v.2016 core set.
For spatiotemporal learning methods (Timenucy and Videonucy), there are predictions for only 83 complexes, as we did not perform simulations on the whole test set.
For models trained with MD data augmentation (MD DA), predictions were carried out on the crystallographic structures as well as on the frames extracted from the simulations performed on the test set (augmented test).
Results on the FEP dataset are also provided for Pafnucy, Proli and Densenucy.
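Because each model was trained in 10 replicates, one common way to summarise prediction files of this kind is to average the predictions per complex and then compare the averages with the experimental affinities. The sketch below shows that idea on toy data; the CSV column names (`complex_id`, `replicate`, `experimental`, `predicted`) and all values are assumptions made for illustration, not the layout or contents of the provided files, and the metrics shown are not claimed to be the authors' evaluation protocol.

```python
import csv
import io
import math
from collections import defaultdict

# Toy data with a hypothetical column layout (not from the repository).
TOY_CSV = """complex_id,replicate,experimental,predicted
cplx_a,1,5.0,5.5
cplx_a,2,5.0,4.5
cplx_b,1,7.0,6.0
cplx_b,2,7.0,7.0
cplx_c,1,9.0,8.5
cplx_c,2,9.0,8.5
"""


def mean_over_replicates(rows):
    """Map each complex to (experimental value, mean prediction over replicates)."""
    predictions = defaultdict(list)
    experimental = {}
    for row in rows:
        predictions[row["complex_id"]].append(float(row["predicted"]))
        experimental[row["complex_id"]] = float(row["experimental"])
    return {
        cid: (experimental[cid], sum(p) / len(p)) for cid, p in predictions.items()
    }


def rmse(pairs):
    """Root-mean-square error over (experimental, predicted) pairs."""
    return math.sqrt(sum((y - p) ** 2 for y, p in pairs) / len(pairs))


def pearson_r(pairs):
    """Pearson correlation coefficient over (experimental, predicted) pairs."""
    ys = [y for y, _ in pairs]
    ps = [p for _, p in pairs]
    mean_y = sum(ys) / len(ys)
    mean_p = sum(ps) / len(ps)
    cov = sum((y - mean_y) * (p - mean_p) for y, p in pairs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    var_p = sum((p - mean_p) ** 2 for p in ps)
    return cov / math.sqrt(var_y * var_p)


rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
averaged = mean_over_replicates(rows)
pairs = list(averaged.values())
print(f"RMSE = {rmse(pairs):.3f}, Pearson R = {pearson_r(pairs):.3f}")
```

To apply this to a real file, replace `io.StringIO(TOY_CSV)` with an opened file handle and adjust the column names to match the actual CSV header.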
Due to the large size of the raw MD data (~4.5 TB), we are not able to share it on Zenodo; it is available upon request.
This work was performed using HPC resources from GENCI-IDRIS (Grant 2021-A0100712496 & 2022-AD011013521) and CRIANN (Grant 2021002).
Created:
2024-06-06



