RotconML: A theoretical dataset for machine learning of spectroscopic parameters

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://zenodo.org/record/4064088
Description:
This dataset comprises ~83,000 small organic molecules containing [H,C,O,N], with structures and harmonic frequency calculations performed at the ωB97X-D/6-31+G(d) level of theory with Gaussian 16. It is intended for training machine learning models, in particular for rotational spectroscopy and for identifying unknown molecules from spectroscopic parameters. Details of the model and the data can be found in this paper. This particular combination of electronic structure method and basis set was benchmarked in earlier work to provide relatively low uncertainties in the predicted rotational constants and, through a cancellation of errors, to provide equilibrium constants that are extremely close to the vibrationally averaged (experimental) values. More details can be found in this paper.

The dataset is provided as a comma-separated value (CSV) file, which can be awkward to parse as plain text; I recommend using the `pandas` Python package to load and manipulate it as a DataFrame instead. The columns of this dataset include: rotational constants, moments of inertia and derived values (such as inertial defect and asymmetry parameter), harmonic frequencies and intensities, dipole moments, zero-point energy, the electronic energy, the Cartesian coordinates, the SMILES identifier, the final energy difference after optimization, and the molecular mass.

For more details, users are referred to the papers above or may contact the author. If you use this dataset in your research or work, please cite this Zenodo entry and the following reference: McCarthy, M.; Lee, K. L. K. Molecule Identification with Rotational Spectroscopy and Probabilistic Deep Learning. J. Phys. Chem. A 2020, 124 (15), 3002–3017. https://doi.org/10.1021/acs.jpca.0c01376.
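As a rough illustration of the `pandas` workflow recommended in the description, the sketch below reads a small CSV into a DataFrame and recomputes two of the derived quantities it mentions: Ray's asymmetry parameter, κ = (2B − A − C)/(A − C), and the inertial defect, Δ = I_c − I_a − I_b. The column names (`smiles`, `A`, `B`, `C`), the choice of MHz as the unit, and the two sample rows are assumptions made for this example; they are not taken from the actual file, so check the real header before adapting it.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the real CSV. The column names
# and the rounded, illustrative values (in MHz) are assumptions, not dataset rows.
SAMPLE_CSV = """smiles,A,B,C
O,835800.0,435400.0,278100.0
C=O,282000.0,38800.0,34000.0
"""

# h / (8 * pi^2) expressed in amu * Angstrom^2 * MHz; dividing it by a
# rotational constant in MHz gives the matching moment of inertia.
MHZ_TO_AMU_A2 = 505379.0


def add_derived_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with asymmetry parameter and inertial defect added."""
    out = df.copy()
    # Ray's asymmetry parameter: -1 for a prolate top, +1 for an oblate top.
    out["kappa"] = (2 * out["B"] - out["A"] - out["C"]) / (out["A"] - out["C"])
    # Principal moments of inertia from the rotational constants.
    for axis in ("A", "B", "C"):
        out[f"I_{axis.lower()}"] = MHZ_TO_AMU_A2 / out[axis]
    # Inertial defect: close to zero for a rigid planar molecule.
    out["inertial_defect"] = out["I_c"] - out["I_a"] - out["I_b"]
    return out


# With the real file, pass its path to pd.read_csv instead of the StringIO.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
derived = add_derived_columns(df)
print(derived[["smiles", "kappa", "inertial_defect"]])
```

Since the description says the file already includes moments of inertia and derived values, a recomputation like this is mainly useful as a consistency check on the loaded columns.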
Created:
2020-10-03