Datasets for Custom-trained Machine-learning Interatomic Potentials: Nitric Acid Aqueous Solution
Source: DataCite Commons | Updated: 2025-11-24 | Indexed: 2026-04-25
Download link:
https://www.osti.gov/servlets/purl/3004762
Description:
This dataset was generated with an iterative active-learning strategy, using the ArcaNN software package (https://github.com/arcann-chem/arcann_training), to train machine-learning interatomic potentials (MLIPs) for aqueous nitric acid. Each active-learning cycle consisted of three stages: (1) training, (2) exploration, and (3) labeling. The initial training set comprised approximately 800 randomly selected configurations from a previous study by Lewis et al. (https://doi.org/10.1021/jp205510q), which investigated nitric acid solutions at 2, 3, 4, and 5 mol/L. For all configurations, single-point calculations of atomic forces and total energies were performed with density functional theory (DFT) at the BLYP-D2 and PBE-D3 levels using the CP2K Quickstep module. Valence electrons were treated explicitly, while core electrons on all atoms were represented by norm-conserving Goedecker–Teter–Hutter (GTH) pseudopotentials. Long-range dispersion interactions were accounted for with Grimme dispersion corrections (the D2 and D3 suffixes above). The calculations used the mixed Gaussian-and-plane-wave scheme, with wave functions expanded in TZV2P-MOLOPT basis sets for all elements and an 800 Ry auxiliary plane-wave cutoff for the electron density. Self-consistent field (SCF) convergence was accelerated using orbital transformation (OT) and direct inversion in the iterative subspace (DIIS), with a convergence threshold of 10^{-6}. All single-point calculations were carried out in periodic orthorhombic cells whose dimensions match those of the configurations sampled from the earlier trajectories. The CELL_REF keyword in CP2K was used to define a fixed reference cell, which keeps the reference data used for MLIP training consistent, particularly when the cell fluctuates, as in NpT simulations. The resulting DFT energies and forces are the ground-truth labels used to train the MLIPs contained in this dataset.
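The three-stage cycle described above (training, exploration, labeling) can be illustrated with a minimal, self-contained Python sketch. It is a toy under stated assumptions and does not use or reproduce the ArcaNN API: every function name and parameter is hypothetical, a sine function stands in for the DFT single-point calculation, nearest-neighbor lookups stand in for the MLIPs, and candidates are selected by committee disagreement (query by committee), a common active-learning criterion that the description does not itself specify.

```python
import math
import random
import statistics


def reference_label(x):
    """Stand-in for the expensive reference calculation (DFT in the dataset)."""
    return math.sin(x)


def train_committee(data, n_models, rng):
    """Training stage: each toy model is a bootstrap resample of the data."""
    return [[rng.choice(data) for _ in data] for _ in range(n_models)]


def predict(model, x):
    """Toy model: return the label of the nearest training point."""
    return min(model, key=lambda pair: abs(pair[0] - x))[1]


def explore(rng, n_candidates, lo, hi):
    """Exploration stage: propose new configurations (here, random 1D points)."""
    return [rng.uniform(lo, hi) for _ in range(n_candidates)]


def disagreement(committee, x):
    """Spread of committee predictions, used as an uncertainty proxy."""
    return statistics.pstdev([predict(model, x) for model in committee])


def active_learning(n_cycles=5, n_initial=8, n_models=4,
                    threshold=0.05, max_new=10, seed=0):
    """Run the toy cycle; return the labeled data and its size per cycle."""
    rng = random.Random(seed)
    # Initial training set: randomly chosen points, labeled by the reference.
    data = [(x, reference_label(x)) for x in explore(rng, n_initial, 0.0, 6.0)]
    history = [len(data)]
    for _ in range(n_cycles):
        committee = train_committee(data, n_models, rng)      # (1) training
        candidates = explore(rng, 200, 0.0, 6.0)              # (2) exploration
        scored = [(disagreement(committee, x), x) for x in candidates]
        scored.sort(reverse=True)
        selected = [x for score, x in scored[:max_new] if score > threshold]
        data.extend((x, reference_label(x)) for x in selected)  # (3) labeling
        history.append(len(data))
    return data, history


if __name__ == "__main__":
    _, sizes = active_learning()
    print("training-set size after each cycle:", sizes)
```

In this sketch the expensive reference calculation is only called on points where the committee disagrees, which is the reason for labeling after exploration instead of labeling every sampled configuration.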
Provider:
PNNL (PNNL2)
Created:
2025-11-24



