facebook/OMC25
Source: Hugging Face. Updated: 2025-12-11. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/facebook/OMC25
Description:
OMC25 (Open Molecular Crystals 2025) is currently the largest high-quality DFT dataset of molecular crystals. It was generated at the PBE-D3 level of theory as implemented in the Vienna Ab initio Simulation Package (VASP). Its structures are sampled from relaxation trajectories of molecular crystals that Genarris 3.0 generated from molecules in the OE62 dataset.

The dataset is divided into training and validation splits, stored as LMDBDatabase objects. The training set contains 24,870,226 structures derived from 207,271 molecular crystals and 44,403 molecules, with a storage size of 139 GB. The validation set contains 1,386,816 structures derived from 11,570 molecular crystals and 2,467 molecules, with a storage size of 7.6 GB.

The dataset also provides detailed information on all unique initial molecular crystal structures: CSD reference code, number of molecules in the unit cell, Genarris step, unique crystal identifier, split, number of sampled frames, composition of the molecule and of the crystal, number of atoms, molar mass, and crystal space group.
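To put the split sizes in perspective, the short sketch below derives a few ratios from the counts quoted in the description. The names `SPLITS`, `split_stats`, and `validation_fraction` are hypothetical helpers introduced only for this illustration, and the bytes-per-structure figure assumes 1 GB = 10^9 bytes, which the listing does not specify.

```python
# Counts copied from the dataset description; nothing here reads the dataset itself.
SPLITS = {
    "train": {"structures": 24_870_226, "crystals": 207_271,
              "molecules": 44_403, "size_gb": 139.0},
    "val": {"structures": 1_386_816, "crystals": 11_570,
            "molecules": 2_467, "size_gb": 7.6},
}


def split_stats(split):
    """Derive simple per-split ratios from the raw counts."""
    return {
        # Average number of sampled structures per initial crystal.
        "frames_per_crystal": split["structures"] / split["crystals"],
        # Average number of generated crystals per starting molecule.
        "crystals_per_molecule": split["crystals"] / split["molecules"],
        # Assumption: 1 GB = 1e9 bytes (the listing does not say which convention it uses).
        "bytes_per_structure": split["size_gb"] * 1e9 / split["structures"],
    }


def validation_fraction(splits):
    """Share of all structures that belong to the validation split."""
    total = sum(s["structures"] for s in splits.values())
    return splits["val"]["structures"] / total


if __name__ == "__main__":
    for name, split in SPLITS.items():
        stats = split_stats(split)
        print(name, {key: round(value, 2) for key, value in stats.items()})
    print("validation fraction:", round(validation_fraction(SPLITS), 4))
```

Under these assumptions, both splits average roughly 120 sampled structures per initial crystal, and the validation split holds a little over 5% of all structures.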
Provided by:
facebook



