CrossDocked2020
Source: DataCite Commons. Updated: 2024-11-11. Indexed: 2025-04-16.
Download link:
https://ieee-dataport.org/documents/crossdocked2020
Description:
In structure-based drug design (SBDD), a major challenge is generating high-affinity 3D ligand molecules that can effectively bind to specific protein targets, which requires accurately capturing complex protein-ligand interactions. Although existing diffusion models have demonstrated potential in molecular generation tasks, they often struggle with accurately capturing the complex interactions between proteins and ligands. To address this problem, we propose MSIDiff, a multi-stage interaction-aware diffusion model for protein-specific molecular generation. MSIDiff uses the pre-trained model MSINet to extract real protein-ligand interaction information during the initial diffusion stage and incorporates this information into the reverse process to ensure that the generated molecules exhibit accurate interaction relationships with target proteins. Through a scoring mechanism, MSIDiff filters key nodes to extract crucial protein-ligand interaction data and uses the cross-layer interaction update module with GRU to recursively integrate information from different denoising stages, enabling effective cross-layer information transmission. Experimental results on the CrossDocked2020 dataset show that MSIDiff can generate molecules with more realistic 3D structures and higher binding affinity to protein targets, achieving an Avg. Vina Score of up to -6.36, while maintaining appropriate molecular properties.
Provider:
IEEE DataPort
Created:
2024-11-11
Summary
Dataset Introduction

Background and Challenges
Background Overview
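CrossDocked2020 is a dataset for structure-based drug design, focused on generating high-affinity 3D ligand molecules that bind to specific protein targets. It is categorized under biomedical and health sciences and supports multiple file formats, but the specific files are not listed on the detail page.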
The content above was collected and summarized by 遇见数据集.



