Quantum Descriptor-Based Machine-Learning Modeling of Thermal Hazard of Cyclic Sulfamidates

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://figshare.com/articles/dataset/Quantum_Descriptor-Based_Machine-Learning_Modeling_of_Thermal_Hazard_of_Cyclic_Sulfamidates/29919254
Description:
Cyclic sulfamidates are commonly used building blocks in organic synthesis. Correct classification of their thermal criticality is crucial for the safe use of these compounds in process development and scale-up. In this study, building on our earlier work (Ferrari et al., 2022), we modeled the reaction enthalpy of a family of 5-membered cyclic sulfamidates toward strong bases. The key challenge was the scarcity of measured reaction enthalpies: only 29 measurements were available. To address this, we used descriptors based on the quantum-chemical properties of the molecules, as they are more closely related to reaction enthalpies than typical cheminformatics-based descriptors. This approach allowed us to avoid relying solely on data-to-fit models and to focus instead on chemistry-aware techniques, which are more appropriate for small data sets. Three models were constructed using the quantum-chemical descriptors: the first combining Partial Least Squares (PLS) regression with a Genetic Algorithm (GA), the second based on the Least Absolute Shrinkage and Selection Operator (LASSO) method, and the third a Gaussian Process Regression (GPR) model. The three models achieved coefficients of determination of 0.78, 0.67, and 0.74, respectively. Although the absolute prediction errors were close to 100 J/g, all three techniques gave similar results and correctly classified nearly all compounds into their respective thermal criticality classes. This highlights the methodology’s effectiveness as a reliable framework for preliminary safety assessment and decision-making in process development.
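The description reports coefficients of determination and absolute errors, and notes that errors near 100 J/g can still leave most compounds in the correct thermal criticality class. The sketch below shows how those three quantities could be computed from measured and predicted enthalpies. It is only an illustration. The enthalpy values are made up, the class thresholds (300 and 600 J/g) are hypothetical, and the function names are invented for this example. None of them come from the study or the dataset.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measurement and prediction."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def criticality_class(enthalpy_j_per_g, thresholds=(300.0, 600.0)):
    """Map an enthalpy to a class index (0 = lowest) by its magnitude.

    The thresholds are hypothetical placeholders, not the study's classes.
    """
    magnitude = abs(enthalpy_j_per_g)
    return sum(magnitude >= t for t in thresholds)


def class_agreement(y_true, y_pred, thresholds=(300.0, 600.0)):
    """Fraction of compounds whose predicted class matches the measured one."""
    matches = sum(
        criticality_class(t, thresholds) == criticality_class(p, thresholds)
        for t, p in zip(y_true, y_pred)
    )
    return matches / len(y_true)


# Made-up reaction enthalpies in J/g (illustrative only, not from the dataset).
measured = [-250.0, -480.0, -720.0, -310.0, -905.0]
predicted = [-340.0, -400.0, -650.0, -390.0, -800.0]

print("R^2:", round(r_squared(measured, predicted), 3))
print("MAE (J/g):", mean_absolute_error(measured, predicted))
print("Class agreement:", class_agreement(measured, predicted))
```

With wide class bands, a prediction can miss by tens of J/g and still fall in the right class. Only values close to a threshold are at risk of misclassification.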
Created:
2025-08-15