colabfit/Open_Direct_Air_Capture_ODAC2025_Val_Full
Source: Hugging Face. Updated 2026-04-21; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/colabfit/Open_Direct_Air_Capture_ODAC2025_Val_Full
Resource description:
---
configs:
- config_name: default
  data_files: "co/*.parquet"
- config_name: info
  data_files: "ds.parquet"
- config_name: configuration_sets
  data_files: "cs/*.parquet"
- config_name: config_set_mapping
  data_files: "cs_co_map/*.parquet"
license: cc-by-4.0
tags:
- molecular dynamics
- mlip
- interatomic potential
pretty_name: Open Direct Air Capture ODAC2025 Val Full
---
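The `configs` block in the front matter defines four named configurations. A minimal sketch of opening one of them with the Hugging Face `datasets` library follows. The repository ID is taken from the page title, `open_config` is a hypothetical helper, and the `train` split name is an assumption based on how the library usually treats a single `data_files` pattern.

```python
# Config names and file patterns are copied from the card's YAML front matter.
# REPO_ID comes from the page title; the helper name is hypothetical.
REPO_ID = "colabfit/Open_Direct_Air_Capture_ODAC2025_Val_Full"
CONFIG_FILES = {
    "default": "co/*.parquet",
    "info": "ds.parquet",
    "configuration_sets": "cs/*.parquet",
    "config_set_mapping": "cs_co_map/*.parquet",
}


def open_config(name="default", streaming=True):
    """Open one named config from the Hub (needs `datasets` and network access)."""
    if name not in CONFIG_FILES:
        raise ValueError(
            f"unknown config {name!r}; expected one of {sorted(CONFIG_FILES)}"
        )
    from datasets import load_dataset  # lazy import so this module loads without it

    # Assumption: with one data_files pattern per config, rows are exposed
    # under the "train" split.
    return load_dataset(REPO_ID, name, split="train", streaming=streaming)
```

Streaming is the default here because the `default` config holds more than a million configurations. Calling `open_config("info")` is a cheap way to inspect the dataset-level metadata first.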
### <details><summary>Cite this dataset</summary>Sriram, A., Brabson, L. M., Yu, X., Choi, S., Abdelmaqsoud, K., Moubarak, E., de Haan, P., Löwe, S., Brehmer, J., Kitchin, J. R., Welling, M., Zitnick, C. L., Ulissi, Z., Medford, A. J., and Sholl, D. S. _Open Direct Air Capture ODAC2025 Val Full_. ColabFit, 2025. https://doi.org/None</details>
#### This dataset has been curated and formatted for the ColabFit Exchange
#### This dataset is also available on the ColabFit Exchange:
https://materials.colabfit.org/id/DS_3hmdi49yv0er_0
#### Visit the ColabFit Exchange to search additional datasets by author, description, element content and more.
https://materials.colabfit.org
<br><hr>
# Dataset Name
Open Direct Air Capture ODAC2025 Val Full
### Description
The full (unfiltered) validation split of ODAC25. Open Direct Air Capture 2025 (ODAC25) is the largest high-quality DFT dataset for Direct Air Capture, containing over 15,000 Metal-Organic Frameworks (MOFs), including experimental, defective, synthetic, and amine-functionalized MOFs, with four adsorbates: CO2, H2O, N2, and O2. ODAC25 significantly improves upon ODAC23 by adding functionalized MOFs, new adsorbates (N2 and O2), higher k-point convergence, and re-relaxations of empty MOFs. The dataset contains three partitions: (1) `mof_plus_adsorbate` includes full DFT relaxations of different adsorbates on various MOFs; (2) `mof` includes re-relaxations of empty MOFs; (3) `gcmc` includes single-point DFT calculations on configurations derived from Grand Canonical Monte Carlo (GCMC) simulations.
### Dataset authors
Anuroop Sriram, Logan M. Brabson, Xiaohan Yu, Sihoon Choi, Kareem Abdelmaqsoud, Elias Moubarak, Pim de Haan, Sindy Löwe, Johann Brehmer, John R. Kitchin, Max Welling, C. Lawrence Zitnick, Zachary Ulissi, Andrew J. Medford, David S. Sholl
### Publication
https://doi.org/10.48550/arXiv.2508.03162
### Original data link
https://huggingface.co/facebook/ODAC25
### License
CC-BY-4.0
### Number of unique molecular configurations
1,290,240
### Number of atoms
286,971,239
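As a quick sanity check on scale, the two counts above give the average structure size. The snippet below is only a worked division using the numbers stated on this card.

```python
# Counts copied from this card.
n_configurations = 1_290_240
n_atoms = 286_971_239

# Mean number of atoms per configuration.
avg_atoms = n_atoms / n_configurations
print(f"{avg_atoms:.1f} atoms per configuration on average")  # prints 222.4 ...
```

At roughly 220 atoms per configuration, per-atom arrays such as forces dominate the storage cost, which is worth keeping in mind before loading the whole `co/` directory into memory.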
### Elements included
Ag, Al, Bi, Br, C, Cd, Ce, Cl, Co, Cr, Cu, Eu, F, Fe, Gd, H, Hg, I, Mg, Mn, Mo, N, Na, Nb, Nd, Ni, O, P, Pr, S, Sc, Se, Si, Sm, Sr, Tb, Th, U, V, Y, Zn, Zr
### Properties included
energy, adsorption energy, atomic forces
<br>
<hr>
# Usage
- `ds.parquet`: Aggregated dataset information.
- `co/` directory: Configuration rows, each of which includes a structure, calculated properties, and metadata.
- `cs/` directory: Configuration sets, which are subsets of configurations grouped by some common characteristic. If `cs/` does not exist, no configuration sets have been defined for this dataset.
- `cs_co_map/` directory: The mapping of configurations to configuration sets (if defined).
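The layout above can be navigated with a few lines of Python once the files have been downloaded. The sketch below assumes a local copy that preserves the directory names listed above. The helper names (`find_parts`, `has_configuration_sets`, `read_part`) are hypothetical, and reading relies on `pandas.read_parquet` with a Parquet engine such as `pyarrow` installed.

```python
from pathlib import Path

# File patterns follow the Usage list above; the dictionary keys are
# illustrative labels, not official names.
LAYOUT = {
    "info": "ds.parquet",
    "configurations": "co/*.parquet",
    "configuration_sets": "cs/*.parquet",
    "config_set_mapping": "cs_co_map/*.parquet",
}


def find_parts(root, part):
    """Return the sorted Parquet files for one part of a local dataset copy."""
    return sorted(Path(root).glob(LAYOUT[part]))


def has_configuration_sets(root):
    """True if the optional cs/ directory holds at least one Parquet file."""
    return bool(find_parts(root, "configuration_sets"))


def read_part(root, part, columns=None):
    """Concatenate one part into a DataFrame (needs pandas and a Parquet engine)."""
    import pandas as pd  # lazy import: the path helpers above work without it

    files = find_parts(root, part)
    if not files:
        raise FileNotFoundError(f"no files match {LAYOUT[part]!r} under {root}")
    frames = [pd.read_parquet(f, columns=columns) for f in files]
    return pd.concat(frames, ignore_index=True)
```

Reading `info` first is inexpensive. For `configurations`, passing a short `columns` list or reading one file at a time keeps memory use manageable. Column names are documented in the schema pages linked from this card and are not assumed here.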
<br>
#### ColabFit Exchange documentation includes descriptions of content and example code for parsing parquet files:
- [Parquet parsing: example code](https://materials.colabfit.org/docs/how_to_use_parquet)
- [Dataset info schema](https://materials.colabfit.org/docs/dataset_schema)
- [Configuration schema](https://materials.colabfit.org/docs/configuration_schema)
- [Configuration set schema](https://materials.colabfit.org/docs/configuration_set_schema)
- [Configuration set to configuration mapping schema](https://materials.colabfit.org/docs/cs_co_mapping_schema)
Provider:
colabfit



