Prediction rigidities for data-driven chemistry

Source: doi.org · Updated 2024-08-28 · Indexed 2025-03-23
Download link:
https://doi.org/10.24435/materialscloud:6x-gs
Description:
The widespread application of machine learning (ML) to the chemical sciences makes it important to understand how ML models learn to correlate chemical structures with their properties, and what can be done to improve training efficiency while guaranteeing interpretability and transferability. In this work, we demonstrate the wide utility of prediction rigidities, a family of metrics derived from the loss function, in understanding the robustness of ML model predictions. We show that prediction rigidities allow the model to be assessed not only at the global level, but also at the local or component-wise level at which intermediate (e.g. atomic, body-ordered, or range-separated) predictions are made. We leverage these metrics to understand the learning behavior of different ML models and to guide efficient dataset construction for model training. Finally, we implement the formalism for an ML model targeting a coarse-grained system to demonstrate the applicability of prediction rigidities to an even broader class of atomistic modeling problems. This record contains all the data used for the analyses conducted in the associated work published in Faraday Discussions.
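
The description does not spell out how a prediction rigidity is computed, so the following is only a minimal sketch under stated assumptions, not the method used to generate this record. It assumes a regularized linear model with a squared-error loss, whose Hessian with respect to the weights is H = XᵀX + λI, and it assumes the commonly used definition in which the rigidity of a prediction for a feature vector x is 1 / (xᵀ H⁻¹ x). The function names `solve`, `loss_hessian`, and `prediction_rigidity`, and the toy feature matrix, are hypothetical and chosen for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without mutating the inputs.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def loss_hessian(X, reg):
    """Hessian of the assumed loss: H = X^T X + reg * I."""
    d = len(X[0])
    H = [[sum(row[i] * row[j] for row in X) for j in range(d)]
         for i in range(d)]
    for i in range(d):
        H[i][i] += reg
    return H


def prediction_rigidity(X, x_star, reg=1e-8):
    """Assumed definition: PR = 1 / (x_star^T H^{-1} x_star)."""
    H = loss_hessian(X, reg)
    z = solve(H, x_star)  # z = H^{-1} x_star
    return 1.0 / sum(a * b for a, b in zip(x_star, z))


# Toy training set (hypothetical): the first feature direction is
# sampled three times, the second only once.
X_train = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
pr_well_sampled = prediction_rigidity(X_train, [1.0, 0.0])
pr_rarely_sampled = prediction_rigidity(X_train, [0.0, 1.0])
print(pr_well_sampled, pr_rarely_sampled)
```

In this toy case the direction that appears three times in the training set receives a rigidity close to 3, and the direction that appears once receives a rigidity close to 1. This matches the intuition in the description that a higher rigidity indicates a prediction that is more tightly constrained by the training data.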

Provider:
doi.org