QM1B: one billion quantum mechanical simulations containing 9-11 heavy atoms

Source: Figshare. Updated: 2023-11-08. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/QM1B_one_billion_quantum_mechanical_simulations_containing_9-11_heavy_atoms/24459376
Resource description:
This is the permanent storage location for the dataset described in Generating QM1B with PySCFIPU and documented by the accompanying Datasheet.

QM1B is a low-resolution DFT dataset generated using PySCF IPU. It is composed of one billion training examples containing 9-11 heavy atoms. It was created by taking 1.09M SMILES strings from the GDB-11 database and computing molecular properties (e.g. HOMO-LUMO gap) for a set of up to 1000 conformers per molecule.

We provide both the source code for PySCFIPU and dataset tools for using the QM1B dataset.

Dataset schema

See the QM1B datasheet for detailed documentation following the datasheets for datasets framework.

The QM1B dataset is stored in the open-source columnar Apache Parquet format, with the following schema:

smile: The SMILES string taken from GDB11. There are up to 1000 rows (i.e. conformers) with the same SMILES string.
atoms: String representing the atom symbols of the molecule, e.g. "COOH".
z: Integer representation of atoms used by SchNet (the atomic numbers).
energy: Energy of the molecule computed by PySCF IPU (unit eV).
homo: The energy of the Highest Occupied Molecular Orbital (HOMO) (unit eV).
lumo: The energy of the Lowest Unoccupied Molecular Orbital (LUMO) (unit eV).
N: The number of atomic orbitals for the specific DFT computation (depends on the basis set STO3G).
std: The standard deviation of the energy of the last five iterations of running PySCFIPU, used as a convergence criterion.
y: The HOMO-LUMO gap (unit eV).
pos: The atom positions (unit Bohr).

Further examples for working with this dataset are available in the accompanying GitHub repo, where we welcome contributions.
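
As a rough illustration of how the schema fields relate to each other, the sketch below works on a few hand-written rows rather than on the real Parquet files. The rows, their numeric values, the helper names (gap_ev, positions_in_angstrom, group_converged) and the max_std threshold are all hypothetical and not part of the dataset tools. It also assumes that y equals lumo minus homo, which is the usual definition of the HOMO-LUMO gap but is not spelled out in the schema.

```python
from collections import defaultdict

# Bohr radius in angstrom (CODATA 2018), used to convert the `pos` column.
BOHR_TO_ANGSTROM = 0.529177210903

# Hypothetical rows that mimic the schema; values are made up for illustration
# and `pos` is truncated to two atoms for brevity.
rows = [
    {"smile": "CCCCCCCCC", "homo": -7.0, "lumo": 3.0, "std": 0.001,
     "pos": [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]},
    {"smile": "CCCCCCCCC", "homo": -6.5, "lumo": 3.5, "std": 0.5,
     "pos": [[0.0, 0.0, 0.0], [2.5, 0.0, 0.0]]},
    {"smile": "CCCCCCCCO", "homo": -8.0, "lumo": 2.0, "std": 0.002,
     "pos": [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]},
]


def gap_ev(row):
    """HOMO-LUMO gap in eV, assuming y = lumo - homo."""
    return row["lumo"] - row["homo"]


def positions_in_angstrom(row):
    """Convert atom positions from Bohr (as stored) to angstrom."""
    return [[c * BOHR_TO_ANGSTROM for c in atom] for atom in row["pos"]]


def group_converged(rows, max_std):
    """Group conformers by SMILES, keeping rows whose `std` is at most max_std.

    max_std is a hypothetical cut-off chosen by the user, not a dataset value.
    """
    groups = defaultdict(list)
    for row in rows:
        if row["std"] <= max_std:
            groups[row["smile"]].append(row)
    return dict(groups)


converged = group_converged(rows, max_std=0.01)
for smile, conformers in converged.items():
    print(smile, len(conformers), [gap_ev(r) for r in conformers])
```

With the real data, the same helpers could be applied to rows loaded from the Parquet files, for example through pandas.read_parquet, provided the column names match the schema above.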
Created:
2023-11-08