
ISO17 (ISO17 - MD Trajectories of C7O2H10 with total energies and atomic forces)

OpenDataLab | Updated: 2026-05-31 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/ISO17
Resource description:

Description

These molecules were randomly drawn from the largest set of isomers in the QM9 dataset [1], which consists of molecules with a fixed atomic composition (C7O2H10) arranged in different chemically valid structures. The dataset is an extension of the isomer MD data used in [2].

The database was generated from molecular dynamics simulations using the Fritz-Haber Institute ab initio simulation package (FHI-aims) [3]. The simulations used density functional theory (DFT) in the generalized gradient approximation (GGA) with the Perdew-Burke-Ernzerhof (PBE) functional [4] and the Tkatchenko-Scheffler (TS) van der Waals correction method [5].

The database consists of 129 molecules, each with 5,000 conformational geometries, energies, and forces, sampled from the molecular dynamics trajectory at a resolution of 1 femtosecond.

Format

The data is stored in ASE sqlite format. The total energy is stored under the key `total_energy` in eV, and the atomic forces are stored under the key `atomic_forces` in eV/Å. The following Python snippet iterates over the first 10 entries of the dataset located at `path_to_db`:

```python
from ase.db import connect

# Adjust to wherever the archive was extracted; reference.db is one of the partitions.
path_to_db = "reference.db"

with connect(path_to_db) as conn:
    for row in conn.select(limit=10):
        print(row.toatoms())               # geometry as an ase.Atoms object
        print(row["total_energy"])         # total energy in eV
        print(row.data["atomic_forces"])   # atomic forces in eV/Å
```

Partitions

The data is partitioned as in the SchNet paper [6]:

- `reference.db`: 80% of the steps of 80% of the MD trajectories
- `reference_eq.db`: equilibrium conformations of those molecules
- `test_within.db`: the remaining 20% of unseen steps from the reference trajectories
- `test_other.db`: the remaining 20% of unseen MD trajectories
- `test_eq.db`: equilibrium conformations of the test trajectories

In the paper, we split the reference data (`reference.db`) into 400,000 training examples and 4,000 validation examples. The corresponding indices are given in the files `train_ids.txt` and `validation_idx.txt`, respectively.

Benchmark

"Within" and "other" refer to the `test_within.db` and `test_other.db` partitions.

| Model | Energy (within) [eV] | Forces (within) [eV/Å] | Energy (other) [eV] | Forces (other) [eV/Å] |
|------------|-------|-------|-------|-------|
| SchNet [6] | 0.016 | 0.043 | 0.104 | 0.095 |

Download

The dataset is available at: `data/iso17.tar.gz` (799.7 MB)

How to Cite

When using this dataset, please cite the following papers:

1. K. T. Schütt, P.-J. Kindermans, H. E. Sauceda, S. Chmiela, A. Tkatchenko, K.-R. Müller. *SchNet: A continuous-filter convolutional neural network for modeling quantum interactions*. Advances in Neural Information Processing Systems. 2017.
2. K. T. Schütt, F. Arbabzadah, S. Chmiela, K.-R. Müller, A. Tkatchenko. *Quantum-chemical insights from deep tensor neural networks*. Nature Communications, 8, 13890. 2017.
3. R. Ramakrishnan, P. O. Dral, M. Rupp, O. A. von Lilienfeld. *Quantum chemistry structures and properties of 134 kilo molecules*. Scientific Data, 1. 2014.

References

[1] R. Ramakrishnan, P. O. Dral, M. Rupp, O. A. von Lilienfeld. *Quantum chemistry structures and properties of 134 kilo molecules*. Scientific Data, 1. 2014.
[2] K. T. Schütt, F. Arbabzadah, S. Chmiela, K.-R. Müller, A. Tkatchenko. *Quantum-chemical insights from deep tensor neural networks*. Nature Communications, 8, 13890. 2017.
[3] V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu, X. Ren, K. Reuter, M. Scheffler. *Ab initio molecular simulations with numeric atom-centered orbitals*. Comput. Phys. Commun., 180(11), 2175–2196. 2009.
[4] J. P. Perdew, K. Burke, M. Ernzerhof. *Generalized gradient approximation made simple*. Phys. Rev. Lett., 77(18), 3865–3868. 1996.
[5] A. Tkatchenko, M. Scheffler. *Accurate molecular van der Waals interactions from ground-state electron density and free-atom reference data*. Phys. Rev. Lett., 102(7), 073005. 2009.
[6] K. T. Schütt, P.-J. Kindermans, H. E. Sauceda, S. Chmiela, A. Tkatchenko, K.-R. Müller. *SchNet: A continuous-filter convolutional neural network for modeling quantum interactions*. Advances in Neural Information Processing Systems. 2017.
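
The train/validation split described under Partitions could be applied along the following lines. This is a minimal sketch, not the authors' code. It assumes each index file holds one integer per line and that those integers are ASE row ids, which start at 1; if the files turn out to be 0-based, pass `offset=1`. The helper names `read_ids` and `iter_rows` are hypothetical and do not come from the dataset.

```python
from pathlib import Path


def read_ids(path):
    """Read one integer id per non-empty line (assumed file layout)."""
    text = Path(path).read_text()
    return [int(line) for line in text.splitlines() if line.strip()]


def iter_rows(db_path, ids, offset=0):
    """Yield (atoms, total_energy, atomic_forces) for the given row ids.

    offset=0 assumes the ids are ASE row ids (1-based); use offset=1 if
    the index files turn out to be 0-based.
    """
    from ase.db import connect  # imported lazily so read_ids works without ASE

    with connect(db_path) as conn:
        for i in ids:
            row = conn.get(id=i + offset)
            yield row.toatoms(), row["total_energy"], row.data["atomic_forces"]
```

With the archive extracted, `iter_rows("reference.db", read_ids("train_ids.txt"))` would then stream the training examples one at a time. The file locations inside the archive are not given on this page, so adjust the paths as needed.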
Provider:
OpenDataLab
Created:
2022-05-23
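Dataset introduction
Background overview
ISO17 is a molecular dynamics trajectory dataset covering 129 C7O2H10 isomer molecules, each with 5,000 conformational geometries, energies, and forces. It is suited to research in quantum chemistry and molecular dynamics simulation.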
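
The sizes implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted on this page. The inference that `reference.db` covers 101 trajectories rests on the assumption that the 400,000 training and 4,000 validation examples together use every entry of `reference.db`, which the page does not state explicitly.

```python
# Numbers quoted on this page.
n_molecules = 129
steps_per_trajectory = 5_000
n_train, n_val = 400_000, 4_000

# C7O2H10: 7 carbon + 2 oxygen + 10 hydrogen atoms.
n_atoms = 7 + 2 + 10

total_conformations = n_molecules * steps_per_trajectory
reference_steps = steps_per_trajectory * 80 // 100  # 80% of each reference trajectory

# Assumption: train + validation together exhaust reference.db.
reference_entries = n_train + n_val
reference_trajectories = reference_entries // reference_steps

print(total_conformations)     # 645000
print(n_atoms)                 # 19, so one force array would have shape (19, 3)
print(reference_steps)         # 4000
print(reference_trajectories)  # 101, roughly 78% of the 129 molecules
```

Under that assumption, the "80% of the MD trajectories" in the partition description corresponds to 101 of the 129 molecules.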