QM1B: one billion quantum mechanical simulations containing 9-11 heavy atoms
Source: DataCite Commons. Updated 2025-06-01; indexed 2024-07-13.
Download link:
https://plus.figshare.com/articles/dataset/QM1B_one_billion_quantum_mechanical_simulations_containing_9-11_heavy_atoms/24459376/1
Description:
This is the permanent storage location for the dataset described in "Generating QM1B with PySCF IPU" and documented by the accompanying Datasheet.

QM1B is a low-resolution DFT dataset generated using PySCF IPU. It is composed of one billion training examples containing 9-11 heavy atoms. It was created by taking 1.09M SMILES strings from the GDB-11 database and computing molecular properties (e.g. HOMO-LUMO gap) for a set of up to 1000 conformers per molecule.

We provide both the source code for PySCF IPU and dataset tools for using the QM1B dataset.

Dataset schema

See the QM1B datasheet for detailed documentation following the datasheets for datasets framework.

The QM1B dataset is stored in the open-source columnar Apache Parquet format, with the following schema:

smile: The SMILES string taken from GDB-11. There are up to 1000 rows (i.e. conformers) with the same SMILES string.
atoms: String representing the atom symbols of the molecule, e.g. "COOH".
z: Integer representation of atoms used by SchNet (the atomic numbers).
energy: Energy of the molecule computed by PySCF IPU (unit eV).
homo: The energy of the Highest Occupied Molecular Orbital (HOMO) (unit eV).
lumo: The energy of the Lowest Unoccupied Molecular Orbital (LUMO) (unit eV).
N: The number of atomic orbitals for the specific DFT computation (depends on the basis set, STO-3G).
std: The standard deviation of the energy over the last five iterations of running PySCF IPU, used as the convergence criterion std < 0.01 (unit eV).
y: The HOMO-LUMO gap (unit eV).
pos: The atom positions (unit Bohr).

Further examples for working with this dataset are available in the accompanying GitHub repo, where we welcome contributions.
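
To make the schema concrete, here is a minimal sketch of how rows with these columns might be represented and filtered in plain Python. It is not taken from the QM1B tools. The class `QM1BRow`, the helper functions, and the toy rows are all hypothetical, and the numbers are invented for illustration rather than copied from the dataset. The sketch also assumes that `y` equals `lumo - homo`, which is the usual definition of the HOMO-LUMO gap. Reading the actual Parquet files would require a Parquet-capable library, which is left out here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

# Bohr radius in angstrom (CODATA 2018); `pos` is stored in Bohr.
BOHR_TO_ANGSTROM = 0.529177210903
# Convergence criterion quoted in the schema: std < 0.01 eV.
STD_THRESHOLD_EV = 0.01


@dataclass
class QM1BRow:
    """Hypothetical in-memory mirror of one row (one conformer)."""
    smile: str               # SMILES string from GDB-11
    atoms: str               # concatenated atom symbols
    z: List[int]             # atomic numbers
    energy: float            # eV
    homo: float              # eV
    lumo: float              # eV
    N: int                   # number of atomic orbitals
    std: float               # eV, spread of the last five iterations
    y: float                 # eV, HOMO-LUMO gap
    pos: List[List[float]]   # Bohr, one [x, y, z] triple per atom


def is_converged(row: QM1BRow, threshold: float = STD_THRESHOLD_EV) -> bool:
    """Apply the std < threshold convergence criterion."""
    return row.std < threshold


def gap_from_orbitals(row: QM1BRow) -> float:
    """Recompute the gap, assuming y = lumo - homo."""
    return row.lumo - row.homo


def positions_in_angstrom(row: QM1BRow) -> List[List[float]]:
    """Convert atom positions from Bohr to angstrom."""
    return [[c * BOHR_TO_ANGSTROM for c in xyz] for xyz in row.pos]


def group_by_smiles(rows: List[QM1BRow]) -> Dict[str, List[QM1BRow]]:
    """Collect the conformers (rows) that share a SMILES string."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.smile].append(row)
    return dict(groups)


# Toy rows with made-up values; real QM1B molecules have 9-11 heavy atoms.
toy_rows = [
    QM1BRow("O", "OHH", [8, 1, 1], -2000.0, -10.0, 5.0, 7, 0.002, 15.0,
            [[0.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, -1.0, 1.0]]),
    QM1BRow("O", "OHH", [8, 1, 1], -1999.0, -9.0, 5.0, 7, 0.05, 14.0,
            [[0.0, 0.0, 0.0], [0.0, 2.0, 1.0], [0.0, -2.0, 1.0]]),
    QM1BRow("C", "CHHHH", [6, 1, 1, 1, 1], -1000.0, -11.0, 6.0, 9, 0.001, 17.0,
            [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [-1.0, -1.0, 1.0],
             [-1.0, 1.0, -1.0], [1.0, -1.0, -1.0]]),
]

converged_rows = [row for row in toy_rows if is_converged(row)]
conformers = group_by_smiles(toy_rows)
```

Filtering on `std` before training is one plausible use of the convergence column; whether unconverged rows are present in the released files, and how to treat them, is documented in the datasheet rather than assumed here.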
Provider:
Figshare+
Created:
2023-11-08



