
SHARP: Generating Synthesizable Molecules via Fragment-based Hierarchical Action-space Reinforcement Learning for Pareto Optimization

Source: DataCite Commons. Updated: 2025-07-26. Indexed: 2025-09-08.
Download link:
https://figshare.com/articles/dataset/SHARP_Generating_Synthesizable_Molecules_via_Fragment-based_Hierarchical_Action-space_Reinforcement_Learning_for_Pareto_Optimization/29597885
Resource description:
Designing drug-like molecules that satisfy multiple objectives, such as high binding affinity, synthesizability, and drug-likeness, poses a complex global optimization problem over an astronomically large chemical space. Existing deep learning-based molecular generative models often treat this task as distribution modeling, relying on atom-level autoregressive actions with little consideration of explicit optimization feedback. Consequently, they frequently generate invalid structures, converge to local optima, or produce synthetically infeasible candidates. Here, we introduce SHARP (Synthesizable Hierarchical Action-space Reinforcement learning for Pareto optimization), a molecular generator that addresses these limitations via a fragment-based hierarchical action space and reinforcement learning. SHARP ensures synthetic accessibility by applying action masks guided by a pretrained Synthesizability Estimation Model (SEM). The reinforcement learning (RL) policy is trained using a composite reward function that integrates docking scores, pharmacophore matching, and solvent accessibility to generate functionally relevant and experimentally tractable molecules. Across four lead optimization tasks (fragment growing, linker design, scaffold hopping, and sidechain decoration) on a diverse receptor set, SHARP consistently outperforms prior methods in producing molecules with high affinity and synthesizability. These results demonstrate that reinforcement learning with a chemically intuitive action space design can be an effective solution to the optimization challenges in AI-driven drug discovery, offering a robust framework for rational molecular design in structure-based applications.
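
The description mentions action masks guided by a synthesizability model but gives no implementation detail. As a rough illustration only, the general idea of masking a policy over fragment actions might look like the sketch below. The function name `masked_policy`, the per-action scores, and the 0.5 threshold are all hypothetical and are not taken from SHARP or its SEM.

```python
import math

def masked_policy(logits, sem_scores, threshold=0.5):
    """Softmax over candidate fragment actions, with masked actions zeroed.

    `sem_scores` stands in for per-action synthesizability estimates
    (hypothetical values; the real SEM interface is not described in
    the source). Actions scoring below `threshold` get probability 0.
    """
    allowed = [score >= threshold for score in sem_scores]
    if not any(allowed):
        raise ValueError("every action is masked")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(l for l, ok in zip(logits, allowed) if ok)
    weights = [math.exp(l - peak) if ok else 0.0
               for l, ok in zip(logits, allowed)]
    total = sum(weights)
    return [w / total for w in weights]

# Three candidate fragments; the second is judged hard to synthesize.
probs = masked_policy([2.0, 1.0, 0.5], [0.9, 0.2, 0.7])
```

In this sketch the masked action can never be sampled, and the remaining probability mass is renormalized over the allowed actions.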
Provider:
figshare
Created:
2025-07-18