
Dataset of AI-guided Multi-Configuration Dirac-Fock (MCDF) calculations for atoms and ions with 1 ≤ Z ≤ 118 and 1 ≤ N ≤ 10

Source: DataCite Commons. Updated: 2026-04-20. Indexed: 2026-05-04.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/5RV42G
Description:
Datatoms 2026-1z118-1n10 is a large-scale computational atomic physics dataset containing Multi-Configuration Dirac-Fock (MCDF) calculation results for atoms and ions with nuclear charge 1 ≤ Z ≤ 118 and electron numbers 1 ≤ N ≤ 10. The dataset provides structured, high-accuracy atomic structure data for more than 1,000 atomic systems, covering over 400,000 atomic states and more than 100 million radiative transitions.

Nature of the Data

The dataset consists of relativistic atomic structure calculations performed within a Multi-Configurational Self-Consistent Field (MCSCF) framework, including electron-electron correlation effects and Quantum Electrodynamics (QED) corrections. The dataset provides:

- Total energies
- Excitation energies relative to the ground state
- Mean orbital radii
- Landé g-factors
- Radial electron densities
- Configuration weights in jj-coupling
- Convergence diagnostics and error estimates
- Radiative transition energies and transition probabilities
- Associated experimental observations when matching transitions are available
- Predicted lifetimes and linewidths

The data is distributed as JSONL (JSON Lines) files, enabling streaming and large-scale processing. Each file corresponds to a specific atom or ion identified by its number of electrons (N), protons (Z), and nucleons (A). The format is fully documented in the accompanying README file.

Context of Production

The dataset was produced using PyMDFGME, a Python wrapper and extension of the Multiconfiguration Dirac-Fock General Matrix Elements (MDFGME) atomic structure solver. Calculations were performed at the Commissariat à l'énergie atomique et aux énergies alternatives (CEA) on the EXA supercomputer. The production required approximately 200 million CPU hours distributed over 64,000 cores, representing an estimated computational cost of about €1 million. More than 12 TB of raw data were generated, from which a curated subset (a few GB) is distributed in this repository.
The original scientific objective was to build a training dataset for a neural network intended to predict jj-configuration weights prior to multi-configuration atomic structure calculations. Such a model is expected to enable a preselection of the most important configurations, thereby reducing the size of the configuration space and the overall computational cost while maintaining good predictive accuracy.

To enable the production of this dataset at scale, it was necessary to automate the selection of convergence parameters for the MDFGME solver, which are traditionally chosen manually. For this purpose, a first neural network, trained on a smaller, previously generated dataset, was used during the production process. Its role was to automatically select suitable convergence parameters, making it possible to carry out fully systematic large-scale calculations, improving robustness and reducing the need for manual intervention.

Beyond its initial machine-learning motivation, the dataset was extended and structured to maximize its scientific value for the broader community. Potential applications include:

- Identification of chemical elements from spectroscopic data
- Search for metastable atomic states
- Benchmarking of atomic structure solvers
- Studies of relativistic and correlation effects along isoelectronic sequences
- Investigation of configuration mixing in highly charged ions
- Training of surrogate models for atomic structure calculations
- Plasma opacity calculations
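
Because the data is distributed as JSONL, each file can be read one record at a time without loading it fully into memory. The following is a minimal sketch of that streaming pattern using only the Python standard library. The helper name `iter_records` and the sample field names (`Z`, `N`, `label`) are hypothetical and chosen for illustration only; the actual record schema is the one documented in the dataset's README.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"invalid JSON on line {line_number}") from exc


# Hypothetical in-memory sample standing in for a dataset file.
# The field names below are assumptions, not the dataset's real schema.
sample = io.StringIO(
    '{"Z": 26, "N": 2, "label": "state-a"}\n'
    '\n'
    '{"Z": 26, "N": 2, "label": "state-b"}\n'
)

records = list(iter_records(sample))
print(len(records))
```

For a file on disk, the same generator can be fed a handle opened with `open(path, encoding="utf-8")` and consumed in a `for` loop, so that only one record is held in memory at a time.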
Provider:
Recherche Data Gouv
Created:
2026-03-04