
Advancing chemical synthesis with machine learning: opportunities and limitations

DataCite Commons, updated 2024-11-11, indexed 2025-04-17
Download link:
https://curate.nd.edu/articles/dataset/Advancing_chemical_synthesis_with_machine_learning_opportunities_and_limitations/26858434
Resource description:
With advances in computational power and increased data availability, machine learning (ML) has been applied to predicting chemical reactions and proposing synthetic pathways. This thesis contributes to advancing chemical reaction discovery through ML across three primary domains. First, computational methods were used to analyze transition states in reaction routes generated by the Sarpong group using Synthia™, evaluating their computational feasibility. Then, industrial electronic lab notebook (ELN) data, supported by AZ, were processed and featurized. Various ML techniques, including Random Forests (RF), k-Nearest Neighbors (KNN), Neural Networks (NN), and Graph Neural Networks (GNN), were applied to predict reaction yields. Yield imbalances in high-throughput experimentation (HTE) and ELN data were addressed with imbalanced regression methods to improve yield prediction in critical regions. Large Language Models (LLMs) were integrated for data extraction, resolving inconsistencies in USPTO datasets from multiple sources, and for investigating the intricate information space of reaction procedures through a specific study on t-butyl ester deprotection. In the second part, substantial advancements were achieved in Molecular Representation Learning (MRL) to accurately capture molecular structures and physical behavior. By evaluating 3D GNNs and conformer ensemble-based models, this research extends beyond traditional SMILES, fingerprints, and 2D molecular graphs, enhancing the precision of predictions for molecule-level and reaction-level properties. These improvements are crucial for tasks such as enantiomeric excess (ee) selectivity prediction and binding energy (BE) prediction studies.
Provider:
University of Notre Dame
Created:
2024-08-27