AiZynthTrain: Robust, Reproducible, and Extensible Pipelines for Training Synthesis Prediction Models
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/AiZynthTrain_Robust_Reproducible_and_Extensible_Pipelines_for_Training_Synthesis_Prediction_Models/22331076
Description:
We introduce the AiZynthTrain Python package for training synthesis
models in a robust, reproducible, and extensible way. It contains
two pipelines that create a template-based one-step retrosynthesis
model and a RingBreaker model that can be straightforwardly integrated
in retrosynthesis software. We train such models on the publicly available
reaction data set from the U.S. Patent and Trademark Office (USPTO),
and these are the first retrosynthesis models created in a completely
reproducible end-to-end fashion, starting with the original reaction
data source and ending with trained machine-learning models. In particular,
we show that employing new heuristics implemented in the pipeline
greatly improves the ability of the RingBreaker model for disconnecting
ring systems. Furthermore, we demonstrate the robustness of the pipeline
by training on a more diverse but proprietary data set. We envisage
that this framework will be extended with other synthesis models in
the future.
Created:
2023-03-24



