ORDerly benchmarks of chemical reactions
DataCite Commons | Updated 2025-06-01 | Indexed 2024-08-19
Download link:
https://figshare.com/articles/dataset/ORDerly_chemical_reactions_condition_benchmarks/23298467/4
Description:
Benchmark datasets generated with ORDerly for chemical reaction prediction tasks:
1. ORDerly-forward: Forward reaction prediction (predict reaction products given reactants, solvents, and agents).
2. ORDerly-retro: Retrosynthesis prediction (predict reactants given a desired product).
3. ORDerly-condition: Reaction condition prediction (predict solvents and agents given reactants and products). Reactions with rare solvents and agents (frequency <100) have been removed.
4. ORDerly-condition-with-rare: Reaction condition prediction (predict solvents and agents given reactants and products). Reactions with rare solvents and agents have not been removed.
Config: Contains the .log and .json files showing the parameters used in cleaning and the impact on dataset size after each cleaning step.
Note that all datasets here were created using the reaction string and our chemically informed logic to assign reaction roles.
Paper: https://chemrxiv.org/engage/chemrxiv/article-details/64ca5d3e4a3f7d0c0d78ca42
NeurIPS workshop paper: https://openreview.net/forum?id=R8FQMsECIS
Code: https://github.com/sustainable-processes/orderly
The supplementary datasets used for this work can be found here: https://doi.org/10.6084/m9.figshare.23502372.v3
Feel free to email me, Daniel Wigh, at dsw46@cam.ac.uk or my supervisor Alexei A. Lapkin.
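The difference between ORDerly-condition and ORDerly-condition-with-rare is a frequency filter on solvents and agents. The sketch below illustrates one way such a filter could work. It is not the ORDerly implementation: the record layout (dictionaries with `solvents` and `agents` lists of SMILES strings), the function name `remove_rare_conditions`, the choice to count solvents and agents together, and the single-pass counting are all assumptions made for illustration.

```python
from collections import Counter


def remove_rare_conditions(reactions, min_frequency=100):
    """Keep reactions whose solvents and agents each occur in at least min_frequency reactions."""
    counts = Counter()
    for rxn in reactions:
        # A set counts each molecule at most once per reaction.
        counts.update(set(rxn["solvents"]) | set(rxn["agents"]))
    # Single pass: counts are not recomputed after reactions are dropped.
    # Reactions with no solvents or agents are kept (all() of an empty set is True).
    return [
        rxn
        for rxn in reactions
        if all(counts[m] >= min_frequency for m in set(rxn["solvents"]) | set(rxn["agents"]))
    ]


# Toy data (illustrative SMILES, not taken from the benchmark).
toy = [
    {"solvents": ["O"], "agents": ["[Pd]"]},
    {"solvents": ["O"], "agents": ["[Pd]"]},
    {"solvents": ["CCO"], "agents": ["[Pd]"]},  # ethanol appears only once
]
kept = remove_rare_conditions(toy, min_frequency=2)
print(len(kept))  # 2
```

The toy call uses `min_frequency=2` so the effect is visible on three records; the dataset description states a threshold of 100. The actual cleaning parameters are recorded in the Config .log and .json files.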
Provider:
figshare
Created:
2024-02-05
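Collected summary
Dataset introduction

Background and challenges
Background overview
This is a benchmark dataset for chemical reaction prediction tasks, covering three task types: forward reaction prediction, retrosynthesis prediction, and reaction condition prediction. It uses chemically informed logic to process the reaction data and is accompanied by related research papers and code.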
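To make the three task types concrete, the sketch below splits one reaction record into the inputs and targets each task would use. The field names (`reactants`, `products`, `solvents`, `agents`), the function name `task_views`, and the example reaction are hypothetical and chosen for illustration. They do not reflect the actual file layout of the benchmark.

```python
def task_views(rxn):
    """Return (inputs, targets) for each of the three task types."""
    return {
        # Forward: reactants, solvents, and agents are given; products are predicted.
        "forward": (
            {"reactants": rxn["reactants"], "solvents": rxn["solvents"], "agents": rxn["agents"]},
            {"products": rxn["products"]},
        ),
        # Retro: the desired product is given; reactants are predicted.
        "retro": (
            {"products": rxn["products"]},
            {"reactants": rxn["reactants"]},
        ),
        # Condition: reactants and products are given; solvents and agents are predicted.
        "condition": (
            {"reactants": rxn["reactants"], "products": rxn["products"]},
            {"solvents": rxn["solvents"], "agents": rxn["agents"]},
        ),
    }


# Illustrative esterification written as SMILES (not taken from the benchmark).
example = {
    "reactants": ["CC(=O)O", "CCO"],   # acetic acid, ethanol
    "products": ["CCOC(C)=O"],         # ethyl acetate
    "solvents": ["Cc1ccccc1"],         # toluene
    "agents": ["OS(=O)(=O)O"],         # sulfuric acid
}
views = task_views(example)
inputs, targets = views["condition"]
print(sorted(inputs), sorted(targets))
```

The same record serves all three tasks; only the split between what is given and what is predicted changes.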



