
modelforge curated dataset: ANI-1x

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/12176579
Description:
Curated ANI-1x Dataset: full dataset, version "full_dataset_v0"

This record provides a curated HDF5 file of the ANI-1x dataset, designed to be compatible with modelforge, an infrastructure for implementing and training neural network potentials (NNPs). The dataset contains 4,956,005 conformers in total across 3,114 unique entries. Where applicable, the units of properties are provided in the data file, encoded as strings compatible with the openff-units package. For more information about the structure of the data file, see: https://github.com/choderalab/modelforge/wiki/Dataset-and-curation#curation-module

This curated dataset was generated using the modelforge software at commit c5c7153.
Source code at this commit: https://github.com/choderalab/modelforge/tree/c5c7153e06172fe8e6f25015250ecb5db05655cc
Script used to generate the dataset: https://github.com/choderalab/modelforge/blob/c5c7153e06172fe8e6f25015250ecb5db05655cc/modelforge/curation/scripts/curate_ani1x.py

Source dataset:
The ANI-1x data set includes properties for small organic molecules that contain H, C, N, and O. It contains nearly 5 million conformers, computed at the ωB97X/6-31G(d) level of theory using Gaussian 09. A subset of the conformers (~500K) was also computed with accurate coupled cluster methods (ANI-1ccx).

Citations:
ANI-1x dataset: Smith, J. S.; Nebgen, B.; Lubbers, N.; Isayev, O.; Roitberg, A. E. Less Is More: Sampling Chemical Space with Active Learning. J. Chem. Phys. 2018, 148 (24), 241733. https://doi.org/10.1063/1.5023802
ANI-1ccx dataset: Smith, J. S.; Nebgen, B. T.; Zubatyuk, R.; Lubbers, N.; Devereux, C.; Barros, K.; Tretiak, S.; Isayev, O.; Roitberg, A. E. Approaching Coupled Cluster Accuracy with a General-Purpose Neural Network Potential through Transfer Learning. Nat. Commun. 2019, 10 (1), 2903. https://doi.org/10.1038/s41467-019-10827-4
ωB97X/def2-TZVPP data: Zubatyuk, R.; Smith, J. S.; Leszczynski, J.; Isayev, O. Accurate and Transferable Multitask Prediction of Chemical Properties with an Atoms-in-Molecules Neural Network. Sci. Adv. 2019, 5 (8), eaav6490. https://doi.org/10.1126/sciadv.aav6490

Source dataset, released under the CC0 1.0 Universal license: Smith, Justin S; Zubatyuk, Roman; Nebgen, Benjamin; Lubbers, Nicholas; Barros, Kipton; Roitberg, Adrian; et al. (2020). The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. figshare. Collection. https://doi.org/10.6084/m9.figshare.c.4712477.v1
GitHub repository: https://github.com/aiqm/ANI1x_datasets
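Because the description above says that property units are stored in the HDF5 file as strings, a quick way to get oriented is to list each dataset together with its attached attributes. The following is a minimal sketch under stated assumptions, not the modelforge loading API: the function names summarize_node and list_datasets are hypothetical, the internal layout of the file is not assumed (see the wiki page linked above for the actual structure), and no particular attribute name for the unit string is assumed. It relies on the third-party h5py package.

```python
from typing import Optional


def summarize_node(name: str, obj) -> Optional[str]:
    """Return a one-line summary for a dataset-like node, or None for a group.

    A node is treated as a dataset if it exposes `shape` and `dtype`,
    which h5py.Dataset does and h5py.Group does not.
    """
    if not (hasattr(obj, "shape") and hasattr(obj, "dtype")):
        return None
    # Report every attribute as-is; the attribute that holds the unit
    # string is not named in this record, so none is assumed here.
    attrs = {key: obj.attrs[key] for key in obj.attrs}
    return f"{name}: shape={obj.shape}, dtype={obj.dtype}, attrs={attrs}"


def list_datasets(path: str, limit: int = 20) -> list:
    """Walk an HDF5 file and collect summaries of the first `limit` datasets."""
    import h5py  # third-party; imported lazily so summarize_node works without it

    lines = []

    def visit(name, obj):
        line = summarize_node(name, obj)
        if line is not None:
            lines.append(line)
        # Returning a non-None value stops h5py's traversal early.
        return True if len(lines) >= limit else None

    with h5py.File(path, "r") as handle:
        handle.visititems(visit)
    return lines
```

Calling list_datasets on the downloaded file (its local path is whatever you saved it as) and printing the returned lines shows the first 20 datasets. The limit exists because a file with millions of conformers may contain a very large number of nodes.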
Created:
2024-06-20