Data sets and machine learning models for: Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8049536
Resource description:
The datasets and final machine learning model files for the manuscript "Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates". Citation should refer directly to the manuscript:
Chung, Y.; Green, W. H. Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates. ChemRxiv 2023, doi: 10.26434/chemrxiv-2023-f20bg
To use the machine learning models, please refer to the sample files and instructions on https://github.com/yunsiechung/chemprop/tree/RxnSolvKSE_ML.
Detailed information can be found in the README.md file.
Details on the files
The columns of the pretraining and finetuning set CSV files are:
rxn_smiles: atom-mapped reaction SMILES
solvent_smiles: solvent SMILES
ddGsolv: solvation free energy of activation of a reaction-solvent pair at 298K in kcal/mol (main prediction target)
ddHsolv: solvation enthalpy of activation of a reaction-solvent pair at 298K in kcal/mol (main prediction target)
dGsolv_reactant: solvation free energy of reactant(s) at 298K in kcal/mol (additional feature)
dGsolv_product: solvation free energy of product(s) at 298K in kcal/mol (additional feature)
dHsolv_reactant: solvation enthalpy of reactant(s) at 298K in kcal/mol (additional feature)
dHsolv_product: solvation enthalpy of product(s) at 298K in kcal/mol (additional feature)
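As a minimal sketch of how these columns might be read, the snippet below checks a CSV header against the documented column names and converts the numeric columns to floats. It uses only the Python standard library. The helper name `read_rows` and the sample row are hypothetical and for illustration only; the sample values are placeholders, not data from the archive.

```python
import csv
import io

# Column names as documented in the list above.
ID_COLUMNS = ["rxn_smiles", "solvent_smiles"]
TARGET_COLUMNS = ["ddGsolv", "ddHsolv"]
FEATURE_COLUMNS = [
    "dGsolv_reactant",
    "dGsolv_product",
    "dHsolv_reactant",
    "dHsolv_product",
]


def read_rows(handle, numeric_columns):
    """Read a CSV and convert the requested columns to float (kcal/mol).

    Hypothetical helper, not part of the published code.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in ID_COLUMNS + numeric_columns if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        parsed = {c: row[c] for c in ID_COLUMNS}
        for c in numeric_columns:
            parsed[c] = float(row[c])
        rows.append(parsed)
    return rows


# Placeholder row for illustration only; NOT taken from the dataset.
SAMPLE = (
    "rxn_smiles,solvent_smiles,ddGsolv,ddHsolv\n"
    "RXN_SMILES_PLACEHOLDER,O,0.5,-1.25\n"
)

rows = read_rows(io.StringIO(SAMPLE), TARGET_COLUMNS)
print(rows[0])
```

For a real file, pass an open file handle instead of the in-memory sample, and pass `TARGET_COLUMNS + FEATURE_COLUMNS` for the files that carry both targets and features.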
Data sets under 'RxnSolvKSE_dataset_v1.0.zip'
pretraining_set: contains the dataset used for pre-training
  all_data: contains all calculated data
    pretraining_rxn_solvent_ddGsolv_ddHsolv_with_features_all.csv: contains both main prediction targets and additional features for reaction-solvent pairs
    pretraining_solvent_info.csv: list of all solvents
    pretraining_unique_rxn.csv: list of all reactions, both forward and reverse directions
  chosen_500k_data: contains the chosen 500k data
    pretraining_rxn_solvent_ddGsolv_ddHsolv_500k.csv: contains main prediction targets for reaction-solvent pairs
    pretraining_features_react_prod_dGsolv_dHsolv_500k.csv: contains additional features for reaction-solvent pairs
finetuning_set: contains the dataset used for fine-tuning
  all_data: contains all calculated data
    finetuning_rxn_solvent_ddGsolv_ddHsolv_with_features_all.csv: contains both main prediction targets and additional features for reaction-solvent pairs. The rxn_key column indicates whether the reaction is bimolecular hydrogen abstraction (bihabs), unimolecular hydrogen migration (intrahabs), or radical addition to a multiple bond (raddition). The 'fwd' and 'rev' labels indicate forward and reverse reactions, respectively.
    finetuning_solvent_info.csv: list of all solvents
    finetuning_unique_rxn.csv: list of all reactions, both forward and reverse directions
  chosen_data: contains chosen data
    finetuning_rxn_solvent_ddGsolv_ddHsolv_chosen.csv: contains main prediction targets for reaction-solvent pairs
    finetuning_features_react_prod_dGsolv_dHsolv_chosen.csv: contains additional features for reaction-solvent pairs
experimental_set/expt_rxn_atom_mapped_smiles.csv: contains the atom-mapped reaction SMILES used for the experimental data. The original experimental data can be found at https://zenodo.org/record/7747557.
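After downloading the archive, it can be useful to confirm that the files listed above are present. The sketch below compares the documented file names against the contents of a zip archive. It matches on file name only, because the exact folder prefix inside the zip is an assumption not confirmed here. The function name `missing_files` is hypothetical, and the demonstration builds a small in-memory archive of empty files rather than reading the real download.

```python
import io
import zipfile

# File names taken from the listing above.
EXPECTED_FILES = [
    "pretraining_rxn_solvent_ddGsolv_ddHsolv_with_features_all.csv",
    "pretraining_solvent_info.csv",
    "pretraining_unique_rxn.csv",
    "pretraining_rxn_solvent_ddGsolv_ddHsolv_500k.csv",
    "pretraining_features_react_prod_dGsolv_dHsolv_500k.csv",
    "finetuning_rxn_solvent_ddGsolv_ddHsolv_with_features_all.csv",
    "finetuning_solvent_info.csv",
    "finetuning_unique_rxn.csv",
    "finetuning_rxn_solvent_ddGsolv_ddHsolv_chosen.csv",
    "finetuning_features_react_prod_dGsolv_dHsolv_chosen.csv",
    "expt_rxn_atom_mapped_smiles.csv",
]


def missing_files(archive):
    """Return documented file names that do not appear in the archive.

    Only the final path component is compared, so any folder prefix
    inside the zip is ignored.
    """
    present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return [f for f in EXPECTED_FILES if f not in present]


# Demonstration with an in-memory archive of empty files.
# The folder name "some_folder" is made up; the last file is left out on purpose.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in EXPECTED_FILES[:-1]:
        zf.writestr("some_folder/" + name, "")

with zipfile.ZipFile(buffer) as zf:
    demo_missing = missing_files(zf)

print(demo_missing)
```

To check the real download, open it with `zipfile.ZipFile("RxnSolvKSE_dataset_v1.0.zip")` and pass the resulting object to `missing_files`.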
Machine learning model files under 'RxnSolvKSE_ML_model_files.zip'
This archive contains the Chemprop machine learning model files for predicting ddGsolv and ddHsolv for a reaction-solvent pair. The models take atom-mapped reaction SMILES and solvent SMILES as inputs.
To use these ML models, please refer to the sample files and instructions on https://github.com/yunsiechung/chemprop/tree/RxnSolvKSE_ML
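The predicted quantities can be related to rate constants with a short calculation. The sketch below is not taken from this record. It assumes conventional transition state theory with a solvent-independent prefactor, under which the ratio of the solution-phase to the gas-phase rate constant is exp(-ddGsolv / RT). It also uses the standard relation ddGsolv = ddHsolv - T * ddSsolv to back out a solvation entropy of activation. The function names are hypothetical.

```python
import math

# Gas constant in kcal/(mol K), matching the kcal/mol units of ddGsolv and ddHsolv.
R_KCAL = 1.987204e-3


def relative_rate_constant(ddGsolv, temperature=298.0):
    """Estimate k_solution / k_gas from ddGsolv (kcal/mol).

    Assumes transition state theory with an unchanged prefactor.
    """
    return math.exp(-ddGsolv / (R_KCAL * temperature))


def solvation_entropy_of_activation(ddGsolv, ddHsolv, temperature=298.0):
    """Return ddSsolv in kcal/(mol K) from ddGsolv = ddHsolv - T * ddSsolv."""
    return (ddHsolv - ddGsolv) / temperature


# Illustrative input only; not a value from the dataset or the models.
ratio = relative_rate_constant(-1.0)
print(f"k_solution / k_gas = {ratio:.2f}")
```

Under the same assumption, the ratio of rate constants between two solvents follows from the difference of their ddGsolv values, so a negative ddGsolv corresponds to a reaction that is faster in solution than in the gas phase.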
Created:
2023-10-10



