EquiBind data (EquiBind preprocessing of PDBBind v2020)

OpenDataLab · Updated 2026-05-24 · Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/EquiBind_data
Description:
The protein-ligand complexes of PDBBind v2020, preprocessed as described in the paper "EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction". The corresponding code is hosted at https://github.com/HannesStark/EquiBind. The dataset contains 19,119 of the 19,433 protein-ligand complexes in PDBBind; complexes whose ligand file could not be loaded with RDKit are excluded.

Paper abstract: Predicting how a drug-like molecule binds to a specific protein target is a core problem in drug discovery. An extremely fast computational binding method would enable key applications such as fast virtual screening or drug engineering. Existing methods are computationally expensive because they rely on heavy candidate sampling coupled with scoring, ranking, and fine-tuning steps. We challenge this paradigm with EquiBind, an SE(3)-equivariant geometric deep learning model that directly predicts both i) the receptor binding location (blind docking) and ii) the ligand's bound pose and orientation. EquiBind achieves significant speed-ups and better quality compared to traditional and recent baselines. Further, we show extra improvements when coupling it with existing fine-tuning techniques at the cost of increased running time. Finally, we propose a novel and fast fine-tuning model that adjusts the torsion angles of a ligand's rotatable bonds based on closed-form global minima of the von Mises angular distance to a given input atomic point cloud, avoiding the expensive differential evolution strategies previously used for energy minimization.
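The exclusion rule in the description (drop a complex when RDKit cannot load its ligand file) can be illustrated with a minimal Python sketch. This is not the preprocessing code from the EquiBind repository. The helper names `load_ligand_with_rdkit` and `split_by_loadability` are hypothetical, as is the assumption that ligands are stored as `.sdf` or `.mol2` files. The RDKit import is done lazily, so the filtering logic can also be exercised with a stand-in loader.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional, Tuple


def load_ligand_with_rdkit(path: Path) -> Optional[object]:
    """Parse a ligand file with RDKit; return None if it cannot be loaded.

    Assumption: ligands are .sdf or .mol2 files (hypothetical layout).
    """
    from rdkit import Chem  # lazy import: only needed when this loader is used

    suffix = path.suffix.lower()
    if suffix == ".sdf":
        supplier = Chem.SDMolSupplier(str(path), sanitize=False, removeHs=False)
        # Indexing the supplier yields None for an entry RDKit cannot parse.
        return supplier[0] if len(supplier) > 0 else None
    if suffix == ".mol2":
        return Chem.MolFromMol2File(str(path), sanitize=False, removeHs=False)
    return None  # unsupported file type in this sketch


def split_by_loadability(
    paths: Iterable,
    loader: Callable[[Path], Optional[object]] = load_ligand_with_rdkit,
) -> Tuple[List[Path], List[Path]]:
    """Split ligand paths into (kept, excluded) by whether the loader succeeds."""
    kept: List[Path] = []
    excluded: List[Path] = []
    for raw in paths:
        path = Path(raw)
        try:
            mol = loader(path)
        except Exception:
            # A missing or unreadable file counts as "could not be loaded".
            mol = None
        (kept if mol is not None else excluded).append(path)
    return kept, excluded
```

With the counts given in the description, such a filter would keep 19,119 complexes and exclude the remaining 314 of the 19,433.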
Provider:
OpenDataLab
Created:
2022-05-23
Summary
Dataset introduction
Background and challenges
Background overview
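This dataset is a preprocessed version of PDBBind v2020 prepared for the EquiBind model. It contains 19,119 of the 19,433 protein-ligand complexes, excluding those whose ligand files RDKit cannot load. EquiBind is an SE(3)-equivariant geometric deep learning model that directly predicts how a drug-like molecule binds to a protein, and it is both fast and high-performing. The accompanying paper was published in 2022 by researchers at the Massachusetts Institute of Technology and the Technical University of Munich.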
The above content was collected and summarized by 遇见数据集.