QM9 data for graph2mat
Source: DataCite Commons (updated 2024-08-06, indexed 2025-04-10)
Download link:
https://data.dtu.dk/articles/dataset/QM9_data_for_graph2mat/26195282

Creators
------------
Pol Febrer (pol.febrer@icn2.cat, ORCID 0000-0003-0904-2234)
Peter Bjorn Jorgensen (peterbjorgensen@gmail.com, ORCID 0000-0003-4404-7276)
Arghya Bhowmik (arbh@dtu.dk, ORCID 0000-0003-3198-5116)

Related publication
-------------------
The dataset is published as part of the paper:
"GRAPH2MAT: UNIVERSAL GRAPH TO MATRIX CONVERSION FOR ELECTRON DENSITY PREDICTION"
(https://doi.org/10.26434/chemrxiv-2024-j4g21)

Short description
------------------
This dataset contains the Hamiltonian, Overlap, Density and Energy Density matrices
from SIESTA calculations of the QM9 dataset (https://doi.org/10.6084/m9.figshare.c.978904.v5).

SIESTA 5.0.0 was used to compute the dataset.

Contents
-----------------
The dataset has four directories:

- basis: Contains the files specifying the basis used for each atom.
- pseudos: Contains the pseudopotentials used for the calculation (obtained from
http://www.pseudo-dojo.org/, type NC SR (ONCVPSP v0.5), PBE, standard accuracy)
- runs: The results of running the SIESTA simulations. Contents are discussed next.
- splits: The data splits used in the published paper. Each file "splits_X.json"
contains the splits for training size X.
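
As a minimal sketch, one of the split files can be loaded with Python's standard json module. The internal layout of the JSON is not described here, so the sketch only parses the file and returns whatever it contains. The function name `load_split` is hypothetical, and any training size passed to it must match an existing "splits_X.json" file.

```python
import json
from pathlib import Path


def load_split(splits_dir, train_size):
    """Parse splits_<train_size>.json from the given splits directory."""
    # File naming follows the "splits_X.json" pattern described above.
    path = Path(splits_dir) / f"splits_{train_size}.json"
    with path.open() as f:
        return json.load(f)
```

Inspecting the returned object (for example its type and top-level keys) is a reasonable first step before relying on any particular structure.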

The "runs" directory contains one directory for each run, named with the index
of the run. Each directory contains:
- RUN.fdf, geom.fdf: The input files used for the SIESTA calculation.
- RUN.out: The log of the SIESTA run, which apar
- siesta.TSDE: Contains the Density and Energy Density matrices.
- siesta.TSHS: Contains the Hamiltonian and Overlap matrices.

Each matrix can be read using the sisl Python package (https://github.com/zerothi/sisl)
like:

```python
import sisl

matrix = sisl.get_sile("RUN.fdf").read_X()
```

where X is hamiltonian, overlap, density_matrix or energy_density_matrix.
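
The read_X pattern above can be spelled out for all four matrices of a single run. This is only a sketch: the helper names `read_matrices` and `read_run` are hypothetical, and the run directory passed to `read_run` is assumed to contain RUN.fdf next to the siesta.TSHS and siesta.TSDE files, as described in the list of run contents.

```python
from pathlib import Path

# The four values of X listed above.
MATRIX_KINDS = ("hamiltonian", "overlap", "density_matrix", "energy_density_matrix")


def read_matrices(sile, kinds=MATRIX_KINDS):
    """Call sile.read_<kind>() for each requested kind and collect the results."""
    return {kind: getattr(sile, f"read_{kind}")() for kind in kinds}


def read_run(run_dir, kinds=MATRIX_KINDS):
    """Open RUN.fdf inside one run directory and read the requested matrices."""
    # Imported here so that read_matrices can be used without sisl installed.
    import sisl

    sile = sisl.get_sile(str(Path(run_dir) / "RUN.fdf"))
    return read_matrices(sile, kinds)
```

Calling `read_run` with the path of one run directory returns a dictionary keyed by matrix name, so the Hamiltonian of that run is available under the "hamiltonian" key.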

To reproduce the results presented in the paper, follow the documentation of the graph2mat
package (https://github.com/BIG-MAP/graph2mat).

Cite this data
------------------
https://doi.org/10.11583/DTU.c.7310005
© 2024 Technical University of Denmark

License
-----------------
This dataset is published under the CC BY 4.0 license.
This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator.

Provider: Technical University of Denmark
Created: 2024-08-06



