Dictionary of 140k GDB and ZINC derived AMONs
Source: DataCite Commons. Updated 2026-03-12. Indexed 2026-05-04.
Download link:
https://archive.materialscloud.org/doi/10.24435/materialscloud:fr-d2
Description:
We present all AMONs for the GDB and ZINC databases using no more than 7 non-hydrogen atoms (AGZ7): a computed dictionary of organic chemistry building blocks based on the AMON approach [Huang and von Lilienfeld, Nature Chemistry (2020)]. AGZ7 records Cartesian coordinates of compositional and constitutional isomers, as well as properties, for ∼140k small organic molecules. These were obtained by systematically fragmenting all molecules of ZINC and the majority of GDB-17 into smaller entities that are saturated with hydrogens and contain no more than 7 heavy (non-hydrogen) atoms. AGZ7 covers the elements H, B, C, N, O, F, Si, P, S, Cl, Br, Sn and I, and includes optimized geometries, total energy and its decomposition, Mulliken atomic charges, dipole moment vectors, quadrupole tensors, electronic spatial extent, eigenvalues of all occupied orbitals, LUMO, gap, isotropic polarizability, harmonic frequencies, reduced masses, force constants, IR intensity, normal coordinates, rotational constants, zero-point energy, internal energy, enthalpy, entropy, free energy, and heat capacity (all at ambient conditions), computed at the B3LYP/cc-pVTZ level of theory (pseudopotentials were used for Sn and I). We exemplify the usefulness of this data set with AMON-based machine learning models that predict the total potential energy of seven of the most rigid GDB-17 molecules.
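The two composition constraints stated in the description (at most 7 heavy atoms, and only the listed elements) can be expressed as a small check. This is a minimal sketch, not part of the dataset or its tooling: the names `AGZ7_ELEMENTS`, `heavy_atom_count`, and `fits_agz7_criteria` are hypothetical, and a molecule is assumed to be given simply as a list of element symbols. Passing the check does not mean a molecule is actually in AGZ7, since that also depends on it being a fragment of a GDB-17 or ZINC molecule.

```python
# Elements listed in the AGZ7 description.
AGZ7_ELEMENTS = {"H", "B", "C", "N", "O", "F", "Si", "P", "S", "Cl", "Br", "Sn", "I"}
# Upper bound on non-hydrogen atoms stated in the description.
MAX_HEAVY_ATOMS = 7


def heavy_atom_count(symbols):
    """Count the atoms that are not hydrogen."""
    return sum(1 for s in symbols if s != "H")


def fits_agz7_criteria(symbols):
    """Check only the element set and the heavy-atom limit (hypothetical helper)."""
    return set(symbols) <= AGZ7_ELEMENTS and heavy_atom_count(symbols) <= MAX_HEAVY_ATOMS


# Illustrative compositions (element symbols only, no geometry).
ethanol = ["C", "C", "O"] + ["H"] * 6
toluene = ["C"] * 7 + ["H"] * 8
octane = ["C"] * 8 + ["H"] * 18

for name, mol in [("ethanol", ethanol), ("toluene", toluene), ("octane", octane)]:
    print(name, heavy_atom_count(mol), fits_agz7_criteria(mol))
```

Here octane fails the check because it has 8 heavy atoms, while ethanol (3) and toluene (7) satisfy both constraints.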
Provider:
Materials Cloud
Created:
2025-06-24



