Learning Properties of Ordered and Disordered Materials from Multi-fidelity Data
Source: Figshare. Updated 2020-10-01; indexed 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Learning_Properties_of_Ordered_and_Disordered_Materials_from_Multi-fidelity_Data/13040330/1
Description:
This repository contains two datasets for our recent work "Learning Properties of Ordered and Disordered Materials from Multi-fidelity Data". The first is a multi-fidelity band gap dataset for crystals, and the second is a multi-fidelity energy dataset for molecules.

1. Multi-fidelity band gap data for crystals

The full band gap data used in the paper is in `band_gap_no_structs.gz`. The following code extracts it:

```
import gzip
import json

with gzip.open("band_gap_no_structs.gz", "rb") as f:
    data = json.loads(f.read())
```

`data` is a dictionary with the following format:

```
{"pbe": {mp_id: PBE band gap, mp_id: PBE band gap, ...},
 "hse": {mp_id: HSE band gap, mp_id: HSE band gap, ...},
 "gllb-sc": {mp_id: GLLB-SC band gap, mp_id: GLLB-SC band gap, ...},
 "scan": {mp_id: SCAN band gap, mp_id: SCAN band gap, ...},
 "ordered_exp": {icsd_id: Exp band gap, icsd_id: Exp band gap, ...},
 "disordered_exp": {icsd_id: Exp band gap, icsd_id: Exp band gap, ...}}
```

where `mp_id` is the Materials Project materials ID for the material, and `icsd_id` is the ICSD materials ID. For example, the PBE band gap of NaCl (mp-22862, band gap 5.003 eV) can be accessed with `data['pbe']['mp-22862']`. Note that the Materials Project database evolves over time: an ID may be removed in a later release, and the band gap value for a given material may change.

To get the structure that corresponds to a specific material ID in Materials Project, users can use the `pymatgen` REST API.

1.1. Register at Materials Project [https://www.materialsproject.org](https://www.materialsproject.org) and get an API key.

1.2. In Python, do the following to get the corresponding computed structure.

```
# Older pymatgen releases also allowed `from pymatgen import MPRester`.
from pymatgen.ext.matproj import MPRester

mpr = MPRester("YOUR_API_KEY")  # replace with your own API key
structure = mpr.get_structure_by_material_id("mp-22862")  # any mp_id from `data`
```

A dump of all the material IDs and structures for the 2019.04.01 MP version is provided here: [https://ndownloader.figshare.com/files/15108200](https://ndownloader.figshare.com/files/15108200). Users can download the file and extract the `material_id` and `structure` of every material from it. The `structure` in this case is a CIF string, which `pymatgen` can parse into a structure.

```
from pymatgen.core import Structure

# cif_string is the `structure` field of one entry in the dump
structure = Structure.from_str(cif_string, fmt="cif")
```

The ICSD structures require commercial ICSD access, so they are not provided here.

2. Multi-fidelity molecular energy data

The `molecule_data.zip` archive contains two datasets in `json` format.

2.1 `G4MP2.json` contains calculation results at two fidelities, G4MP2 (6095) and B3LYP (130831), on QM9 molecules.

```
{"G4MP2": {"U0": {ID: G4MP2 energy (eV), ...},
           "molecules": {ID: Pymatgen molecule dict, ...}},
 "B3LYP": {"U0": {ID: B3LYP energy (eV), ...},
           "molecules": {ID: Pymatgen molecule dict, ...}}}
```

2.2 `qm7b.json` contains the molecular energy calculation results for 7211 molecules using the HF, MP2 and CCSD(T) methods with the sto-3g, 6-31g and cc-pvdz basis sets.

```
{"molecules": {ID: Pymatgen molecule dict, ...},
 "targets": {ID: {"HF": {"sto3g": Atomization energy (kcal/mol),
                         "631g": Atomization energy (kcal/mol),
                         "cc-pvdz": Atomization energy (kcal/mol)},
                  "MP2": {"sto3g": Atomization energy (kcal/mol),
                          "631g": Atomization energy (kcal/mol),
                          "cc-pvdz": Atomization energy (kcal/mol)},
                  "CCSD(T)": {"sto3g": Atomization energy (kcal/mol),
                              "631g": Atomization energy (kcal/mol),
                              "cc-pvdz": Atomization energy (kcal/mol)}},
             ...}}
```
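Because each fidelity in the band gap dictionary is keyed by material ID, a common first step is to pair up the materials that appear in two fidelities. The following is a minimal sketch of that step, not code from the paper: the helper names `load_band_gaps` and `paired_gaps` are hypothetical, and the `toy` dictionary uses invented IDs and values so the example runs without the data file.

```python
import gzip
import json


def load_band_gaps(path):
    """Read the gzip-compressed JSON file into a dict of dicts."""
    with gzip.open(path, "rb") as f:
        return json.loads(f.read())


def paired_gaps(data, low, high):
    """Return {material_id: (low_gap, high_gap)} for IDs present in both fidelities."""
    shared = data[low].keys() & data[high].keys()
    return {mid: (data[low][mid], data[high][mid]) for mid in sorted(shared)}


# Toy stand-in for the real file; the IDs and values are invented for illustration.
toy = {
    "pbe": {"mp-1": 1.0, "mp-2": 0.5, "mp-3": 2.0},
    "hse": {"mp-1": 1.5, "mp-3": 3.0},
}

# Number of entries per fidelity.
counts = {fidelity: len(entries) for fidelity, entries in toy.items()}

# Materials with both a PBE and an HSE value, and the mean HSE - PBE shift.
pairs = paired_gaps(toy, "pbe", "hse")
mean_shift = sum(high - low for low, high in pairs.values()) / len(pairs)

print(counts)
print(pairs)
print(mean_shift)
```

With the real file, `toy` would be replaced by `load_band_gaps("band_gap_no_structs.gz")`. The experimental entries are keyed by `icsd_id` rather than `mp_id`, so this kind of pairing only applies directly among the computed fidelities.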
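The nested `targets` layout of `qm7b.json` (molecule ID, then method, then basis set) can be awkward to work with directly. The sketch below shows one possible way to flatten it into rows and to compare two fidelities per molecule. The function names `flatten_targets` and `fidelity_gap` are hypothetical, and `toy_targets` holds invented numbers in place of the real file contents.

```python
def flatten_targets(targets):
    """Turn {ID: {method: {basis: energy}}} into sorted (ID, method, basis, energy) rows."""
    rows = []
    for mol_id, methods in targets.items():
        for method, bases in methods.items():
            for basis, energy in bases.items():
                rows.append((mol_id, method, basis, energy))
    return sorted(rows)


def fidelity_gap(targets, low, high):
    """Per-molecule difference high - low; each fidelity is a (method, basis) pair."""
    return {
        mol_id: methods[high[0]][high[1]] - methods[low[0]][low[1]]
        for mol_id, methods in targets.items()
    }


# Toy stand-in for data["targets"]; the ID and energies are invented for illustration.
toy_targets = {
    "0": {
        "HF": {"sto3g": -400.0, "cc-pvdz": -410.0},
        "CCSD(T)": {"sto3g": -405.0, "cc-pvdz": -420.0},
    },
}

rows = flatten_targets(toy_targets)
gap = fidelity_gap(toy_targets, ("HF", "sto3g"), ("CCSD(T)", "cc-pvdz"))

print(rows)
print(gap)
```

The molecule entries are described as Pymatgen molecule dicts, so `Molecule.from_dict` from `pymatgen.core` should be able to rebuild a molecule object from each one; that step is left out here to keep the sketch free of third-party dependencies.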
Created:
2020-10-01



