Can Machines “Learn” Halide Perovskite Crystal Formation without Accurate Physicochemical Features?

Indexed from NIAID Data Ecosystem on 2026-03-11
Download link:
https://figshare.com/articles/dataset/Can_Machines_Learn_Halide_Perovskite_Crystal_Formation_without_Accurate_Physicochemical_Features_/12461924
Description:
Discovery of new perovskite materials is motivated by a broad range of materials applications and accelerated by recent advances in machine learning (ML). We herein report dataset augmentation, benchmarking, and interrogation for an ongoing experimental campaign consisting of 9483 halide perovskite synthesis experiments. To address limitations in previous work, we developed an improved description of the reactant concentrations in the experiments (validated against experimental observations) and performed experiments quantifying the excess volume of mixing of γ-butyrolactone/formic acid mixtures used in the perovskite syntheses. Combining this improved description of reactant concentration with other physicochemical features of the reactants, we constructed 1108 ML models to elucidate the roles of the algorithm (k-nearest neighbors, linear support-vector machine, and gradient boosted tree), feature set (12 in total), preprocessing regime (e.g., standardization), and training data holdout scheme on ML predictive ability. ML comparisons illustrated that the chemical accuracy of less sophisticated physical models in a dataset does not hinder interpolative model performance. Analysis of feature contributions showed how ML models “learn” competitive representations for concentration using raw experimental descriptions. Interrogation of the most performant models indicated that the numerical values of physicochemical features were not important; rather, these features were being used to identify and interpolate within a particular reactant set. ML models were shown to be capable of making rudimentary extrapolations to untrained chemical systems when compared against basic benchmarks, and models which included the newly developed chemical features were shown to be more reliable than models trained without them.
These results illustrate how a stepwise comparative approach to machine learning can provide insight into what and how much models are “learning” for a given prediction task.
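
The description above mentions training data holdout schemes and extrapolation to untrained chemical systems compared against basic benchmarks, without giving details. The following is a minimal sketch of one way such a test could be set up, assuming a leave-one-group-out split in which every experiment for one reactant set is held out at a time, and assuming a majority-class guess as the basic benchmark. The group labels, outcomes, and function names are hypothetical and are not taken from the dataset.

```python
from collections import Counter, defaultdict


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx) once per distinct group label."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for held_out, test_idx in by_group.items():
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


def majority_baseline(train_labels):
    """Return the most common training label (an assumed basic benchmark)."""
    return Counter(train_labels).most_common(1)[0][0]


# Hypothetical toy data: one reactant-set label and one outcome
# (1 = crystal formed, 0 = no crystal) per experiment.
groups = ["amine_A", "amine_A", "amine_A",
          "amine_B", "amine_B",
          "amine_C", "amine_C"]
outcomes = [1, 1, 0, 1, 0, 0, 0]

# Accuracy of the baseline on each held-out reactant set.
results = {}
for held_out, train_idx, test_idx in leave_one_group_out(groups):
    guess = majority_baseline([outcomes[i] for i in train_idx])
    hits = sum(outcomes[i] == guess for i in test_idx)
    results[held_out] = hits / len(test_idx)

for name, acc in results.items():
    print(f"{name}: baseline accuracy {acc:.2f}")
```

A trained model would be scored on the same held-out indices and compared with these baseline values. scikit-learn offers an equivalent splitter, `LeaveOneGroupOut`, in `sklearn.model_selection`.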
Created:
2020-05-26