Toy dataset with pairs of molecules differing in total charge
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10852522
Description:
On the inclusion of charge and spin states in Cartesian tensor neural network potentials (2024). Guillem Simeon, Antonio Mirarchi, Raul P. Pelaez, Raimondas Galvelis and Gianni De Fabritiis.
Two toy datasets, Dataset A and Dataset B, each comprising five members drawn from five pairs of unique molecular systems, where the two members of a pair differ in total charge. These pairs are indistinguishable to a neural network that does not account for total charge, as they share identical atomic numbers and geometric configurations. The datasets include total charges, Gasteiger partial charges computed with RDKit, and energies and atomic forces calculated with GFN2-xTB for 2000 conformers per molecule. Each dataset therefore contains a total of 10k data points (5 molecules × 2000 conformers). Conformers were generated by minimizing each molecule, displacing atomic positions with Gaussian noise with a standard deviation of 0.2 Å, and keeping only those whose maximum atomic force is below 100 eV/Å.
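The displace-and-filter step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: `perturb_and_filter` and `force_fn` are hypothetical names, `force_fn` stands in for whatever GFN2-xTB wrapper returns forces in eV/Å, and "maximum atomic force" is interpreted here as the largest per-atom force norm.

```python
import numpy as np


def perturb_and_filter(minimized_positions, force_fn, n_conformers=2000,
                       sigma=0.2, max_force=100.0, seed=0, max_attempts=None):
    """Generate conformers by Gaussian displacement plus a force cutoff.

    minimized_positions: (N, 3) array of minimized coordinates in Å.
    force_fn: hypothetical callable mapping (N, 3) positions to (N, 3)
        forces in eV/Å (for example a wrapper around GFN2-xTB).
    """
    rng = np.random.default_rng(seed)
    minimized_positions = np.asarray(minimized_positions, dtype=float)
    if max_attempts is None:
        max_attempts = 100 * n_conformers  # guard against an endless loop

    kept_positions, kept_forces = [], []
    for _ in range(max_attempts):
        if len(kept_positions) == n_conformers:
            break
        # Displace every coordinate with zero-mean Gaussian noise (std = sigma).
        noise = rng.normal(0.0, sigma, size=minimized_positions.shape)
        candidate = minimized_positions + noise
        forces = np.asarray(force_fn(candidate), dtype=float)
        # Assumption: the filter compares the largest per-atom force norm.
        if np.linalg.norm(forces, axis=1).max() < max_force:
            kept_positions.append(candidate)
            kept_forces.append(forces)
    return np.array(kept_positions), np.array(kept_forces)
```

Any callable with the same signature can be passed as `force_fn`, so the loop can be tried with a cheap toy potential before plugging in a real calculator.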
Data Format and Availability:
Both datasets are available in HDF5 (Hierarchical Data Format version 5) format, compatible with the Ace dataset (layout version 2.0) in TorchMD-Net (https://github.com/torchmd/torchmd-net/blob/main/torchmdnet/datasets/ace.py).
Data Structure:
Each molecule is assigned a unique integer identifier, consistent across both sub-datasets, facilitating pairing between conformations.
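A minimal sketch of how the shared identifiers could be used to pair molecules across the two sub-datasets. It assumes that the identifier is the name of the top-level HDF5 group and that it parses as an integer; neither detail is stated explicitly here, and `paired_molecule_ids` is a hypothetical helper name.

```python
def paired_molecule_ids(dataset_a, dataset_b):
    """Return identifiers present in both datasets, in numeric order.

    dataset_a, dataset_b: mappings keyed by molecule identifier, such as
    two open h5py.File objects or plain dicts.
    Assumption: keys are strings that parse as integers.
    """
    common = set(dataset_a.keys()) & set(dataset_b.keys())
    return sorted(common, key=int)
```

Iterating over the returned identifiers and indexing both files with the same key then yields the two differently charged members of each pair.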
The HDF5 file is structured as follows:
Each top-level group corresponds to a unique molecule.
Within each group, where N is the number of atoms in the molecule and M is the number of conformations, the following datasets are provided:
- atomic_numbers: Atomic numbers of atoms in the molecule, with a shape of (N,);
- formal_charges: Formal charges of atoms in the molecule, with a shape of (N,);
- formation_energies: Formation energy of each conformation, with a shape of (M,), expressed in units of eV;
- forces: Forces on atoms in each conformation, with a shape of (M, N, 3), expressed in eV/Å;
- partial_charges: Gasteiger partial charges of atoms in each conformation, with a shape of (M, N), expressed in e;
- positions: Cartesian coordinates of atoms in each conformation, with a shape of (M, N, 3), expressed in Å.
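The layout listed above can be checked with a short script. The sketch below is hypothetical (the names `expected_shapes`, `check_molecule` and `check_file` are not part of TorchMD-Net), and it assumes that a molecule's total charge equals the sum of its formal charges. The shape checks work on any mapping of arrays, so `h5py` is only needed when reading an actual file.

```python
import numpy as np


def expected_shapes(n_atoms, n_conformers):
    """Shapes for N atoms and M conformations, as listed above."""
    N, M = n_atoms, n_conformers
    return {
        "atomic_numbers": (N,),
        "formal_charges": (N,),
        "formation_energies": (M,),
        "forces": (M, N, 3),
        "partial_charges": (M, N),
        "positions": (M, N, 3),
    }


def check_molecule(group):
    """Validate one molecule group and return (N, M, total_charge)."""
    n_atoms = group["atomic_numbers"].shape[0]
    n_conf = group["positions"].shape[0]
    for name, shape in expected_shapes(n_atoms, n_conf).items():
        actual = tuple(group[name].shape)
        if actual != shape:
            raise ValueError(f"{name}: expected shape {shape}, got {actual}")
    # Assumption: total charge is the sum of the per-atom formal charges.
    total_charge = int(np.asarray(group["formal_charges"][...]).sum())
    return n_atoms, n_conf, total_charge


def check_file(path):
    """Validate every top-level group of an HDF5 file at `path`."""
    import h5py  # imported lazily so check_molecule works without h5py

    with h5py.File(path, "r") as h5:
        return {mol_id: check_molecule(group) for mol_id, group in h5.items()}
```

Calling `check_file` on the path of one of the downloaded HDF5 files would return a dictionary mapping each molecule identifier to its atom count, conformer count and total charge.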
Created:
2024-03-22



