A database of vacancy formation enthalpies for materials discovery

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/5999072
Description:
Authors: Matthew Witman (a), Anuj Goyal (b), Tadashi Ogitsu (c), Anthony McDaniel (a), Stephan Lany (b)
(a) Sandia National Laboratories, (b) National Renewable Energy Laboratory, (c) Lawrence Livermore National Laboratory

Abstract
This dataset provides DFT calculations of cation and oxygen vacancy defects in oxides, which can be used to derive efficient data-driven models for vacancy formation enthalpy. The DFT calculations were performed as described in the associated publication, where a graph neural network surrogate model was trained and used to screen the Materials Project for promising solar thermochemical water splitting materials. The data, models, scripts, and code needed to reproduce those results are described below.

Data & Models
1) data_01_03_22/* corresponds to the oxide compounds used in model training
2) known_cmpds/* corresponds to known STCH compounds
3) screeningMP/* corresponds to the screening-related data
   - screening_inelements/* stores only Materials Project oxides whose composition is a subset of the training elements, and contains all the vacancy defect predictions
   - MP_O_PDs/* stores offline PDs from Materials Project so that oxide stability metrics can be adjusted fairly quickly
   - MP_O_Compounds/* stores possible MP oxide compounds to screen

In general, the above folders contain:
- DFT data/structures, included in sub-directories: poscars, magnetic moments, oxidation states, and csvs (containing the vacancy enthalpy for each unique site)
- cgcnn/*, which contains the processed DFT data for use in the CGCNN code (see Scripts for how to prepare this)
  - id_prop.csv.* contains [cif name, defect formation enthalpy] pairs
    - Different id_prop.csv.* files correspond to different K-fold stratifications
    - In the screening directory, the defect formation enthalpy is omitted since it has not been computed with DFT
  - model-(X1)k(X2)_(X3)_(X4) corresponds to different CV models, where
    - X1 is the training set size (i.e., training with only 10%, 40%, or 100% of the data)
    - X2 is the k-fold index
    - X3 = "struct" or "" for "structure-wise validation" or "defect-wise validation", respectively
    - X4 is the encoding strategy
  - structure X-Yz.cif indicates structure X, defect element Y, symmetry site z, where one instance of that site has been re-ordered to be the first atom in the cif file
  - *.locals contains a one-hot encoding of the oxidation states of all sites in that crystal
  - *.locals_continuous contains a continuous encoding of the oxidation states in that crystal
  - *.globals contains global properties of the host structure

Scripts
- scripts/*.sh: scripts to rerun the screenings for different k-folds, encodings, etc.
- scripts/*.ipynb: notebooks to analyze the results
- scripts/prepare_cgcnn.py: translates the data in (poscars/*, csvs/*, oxstate/*, mags/*) into the ML input needed in cgcnn/*

Code
Install CGCNN and its defect modifications from https://github.com/mwitman1/cgcnndefect

Questions/Collaborations
Please contact mwitman@sandia.gov

Acknowledgements
This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Energy Efficiency and Renewable Energy (EERE), specifically the Hydrogen and Fuel Cell Technologies Office. Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. Part of the work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344. The National Renewable Energy Laboratory (NREL) is operated by the Alliance for Sustainable Energy, LLC, for the DOE under Contract No. DE-AC36-08GO28308. This work used High-Performance Computing resources at NREL, sponsored by DOE-EERE. The views expressed in this article do not necessarily represent the views of the U.S. Department of Energy or the United States Government.
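
As an illustration of the file conventions listed under Data & Models, the sketch below splits an X-Yz name into its structure, defect element, and symmetry-site parts, and reads [cif name, defect formation enthalpy] pairs in the style of id_prop.csv.*. It is a minimal sketch under stated assumptions, not code from the dataset: the helper names (DefectId, parse_defect_name, read_id_prop) are hypothetical, the site label z is assumed to be numeric, the CSV is assumed to have no header row, and the sample rows are invented for illustration.

```python
import csv
import io
import re
from typing import List, NamedTuple, Optional, Tuple


class DefectId(NamedTuple):
    structure: str  # X: host structure label
    element: str    # Y: element removed to create the vacancy
    site: str       # z: symmetry-site label


# Assumed pattern: everything before the last hyphen is X, followed by an
# element symbol (capital letter plus optional lowercase letter) and a
# numeric site label. The ".cif" extension is optional.
_NAME = re.compile(
    r"^(?P<structure>.+)-(?P<element>[A-Z][a-z]?)(?P<site>\d+)(?:\.cif)?$"
)


def parse_defect_name(name: str) -> DefectId:
    """Split a name such as '12-O1.cif' into (structure, element, site)."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"name does not follow the assumed X-Yz pattern: {name!r}")
    return DefectId(match["structure"], match["element"], match["site"])


def read_id_prop(text: str) -> List[Tuple[DefectId, Optional[float]]]:
    """Read [cif name, enthalpy] rows; the enthalpy is None when omitted,
    as in the screening directory where no DFT value exists."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        has_value = len(row) > 1 and row[1].strip() != ""
        value = float(row[1]) if has_value else None
        rows.append((parse_defect_name(row[0]), value))
    return rows


# Invented sample rows (not taken from the dataset).
sample = "12-O1,3.25\n12-Fe2\n"
for defect, enthalpy in read_id_prop(sample):
    print(defect, enthalpy)
```

To read an actual file, pass its contents to read_id_prop, and adjust the regular expression if the real site labels are not purely numeric.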
Created:
2023-06-28