
Materialyze/matpes

Source: Hugging Face | Updated: 2026-04-21 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/Materialyze/matpes
Resource description:
---
license: bsd-3-clause
task_categories:
- graph-ml
language:
- en
tags:
- chemistry
- materials
size_categories:
- 100K<n<1M
dataset_info:
  config_name: default
  features:
  # ---- Composition / chemistry ----
  - name: nsites
    dtype: int32
  - name: elements
    sequence: string
  - name: nelements
    dtype: int32
  - name: composition
    sequence:
    - name: element
      dtype: string
    - name: amount
      dtype: float64
  - name: composition_reduced
    sequence:
    - name: element
      dtype: string
    - name: amount
      dtype: float64
  - name: formula_pretty
    dtype: string
  - name: formula_anonymous
    dtype: string
  - name: chemsys
    dtype: string
  # ---- Cell-level scalars ----
  - name: volume
    dtype: float64
  - name: density
    dtype: float64
  - name: density_atomic
    dtype: float64
  # ---- Symmetry ----
  - name: symmetry
    struct:
    - name: crystal_system
      dtype: string
    - name: symbol
      dtype: string
    - name: number
      dtype: int32
    - name: point_group
      dtype: string
    - name: symprec
      dtype: float64
    - name: angle_tolerance
      dtype: float64
    - name: version
      dtype: string
  # ---- Pymatgen Structure ----
  - name: structure
    struct:
    - name: '@module'
      dtype: string
    - name: '@class'
      dtype: string
    - name: charge
      dtype: float64
    - name: lattice
      struct:
      - name: matrix
        sequence:
          sequence: float64
      - name: pbc
        sequence: bool
      - name: a
        dtype: float64
      - name: b
        dtype: float64
      - name: c
        dtype: float64
      - name: alpha
        dtype: float64
      - name: beta
        dtype: float64
      - name: gamma
        dtype: float64
      - name: volume
        dtype: float64
    - name: properties
      dtype: string
    - name: sites
      sequence:
      - name: species
        sequence:
        - name: element
          dtype: string
        - name: occu
          dtype: float64
      - name: abc
        sequence: float64
      - name: properties
        struct:
        - name: magmom
          dtype: float64
      - name: label
        dtype: string
      - name: xyz
        sequence: float64
  # ---- Labels (DFT targets) ----
  - name: energy
    dtype: float64
  - name: forces
    sequence:
      sequence: float64
  - name: stress
    sequence: float64
  # ---- Identifiers / derived properties ----
  - name: matpes_id
    dtype: string
  - name: bandgap
    dtype: float64
  - name: functional
    dtype: string
  - name: formation_energy_per_atom
    dtype: float64
  - name: cohesive_energy_per_atom
    dtype: float64
  - name: abs_forces
    sequence: float64
  - name: bader_charges
    sequence: float64
  - name: bader_magmoms
    sequence: float64
  # ---- Provenance (MD sampling + MP origin) ----
  - name: provenance
    struct:
    - name: original_mp_id
      dtype: string
    - name: materials_project_version
      dtype: string
    - name: md_ensemble
      dtype: string
    - name: md_temperature
      dtype: float64
    - name: md_pressure
      dtype: float64
    - name: md_step
      dtype: int32
    - name: mlip_name
      dtype: string
configs:
- config_name: pbe
  data_files:
  - split: train
    path: MatPES-PBE-2025.2-charges.json
- config_name: r2scan
  data_files:
  - split: train
    path: MatPES-R2SCAN-2025.2-charges.json
- config_name: pbe-2025.2
  data_files: MatPES-PBE-2025.2-charges.json
- config_name: r2scan-2025.2
  data_files: MatPES-R2SCAN-2025.2-charges.json
- config_name: pbe-2025.1
  data_files: MatPES-PBE-2025.1-charges.json
- config_name: r2scan-2025.1
  data_files: MatPES-R2SCAN-2025.1-charges.json
- config_name: pbe-atoms
  data_files: MatPES-PBE-atoms.json
- config_name: r2scan-atoms
  data_files: MatPES-R2SCAN-atoms.json
papers:
- 2503.04070
---

## Dataset Description

- **Homepage:** [matpes.ai](http://matpes.ai)
- **Paper:** [A Foundational Potential Energy Surface Dataset for Materials](https://doi.org/10.48550/arXiv.2503.04070)
- **Leaderboard:** [MatCalc-Benchmark](http://matpes.ai/benchmarks)
- **Point of Contact:** [Materialyze]

### Dataset Summary

Potential energy surface (PES) datasets with near-complete coverage of the periodic table are used to train foundation potentials (FPs), i.e., machine learning interatomic potentials (MLIPs) with near-complete coverage of the periodic table. MatPES is an initiative by the [Materialyze] Lab and the [Materials Project] to address [critical deficiencies](http://matpes.ai/about) in such PES datasets for materials.

1. **Accuracy.** MatPES is computed using static DFT calculations with stringent convergence criteria. Please refer to the `MatPESStaticSet` in [pymatgen] for details.
2. **Comprehensiveness.** MatPES structures are sampled using a 2-stage version of DImensionality-Reduced Encoded Clusters with sTratified ([DIRECT](https://doi.org/10.1038/s41524-024-01227-4)) sampling from a greatly expanded configuration space of MD structures.
3. **Quality.** MatPES includes computed data from the PBE functional, as well as the high-fidelity r2SCAN meta-GGA functional with improved description across diverse bonding and chemistries.

The initial v2025.1 release comprises ~400,000 structures from 300 K MD simulations. This dataset is much smaller than other PES datasets in the literature, yet FPs trained on it achieve comparable or, in some cases, [improved performance and reliability](http://matpes.ai/benchmarks).

MatPES is part of the MatML ecosystem, which includes the [MatGL] (Materials Graph Library) and [maml] (MAterials Machine Learning) packages, the [MatPES] (Materials Potential Energy Surface) dataset, and the [MatCalc] (Materials Calculator).

[Materialyze]: http://materialyze.ai
[Materials Project]: https://materialsproject.org
[M3GNet]: http://dx.doi.org/10.1038/s43588-022-00349-3
[CHGNet]: http://doi.org/10.1038/s42256-023-00716-3
[TensorNet]: https://arxiv.org/abs/2306.06482
[maml]: https://materialsvirtuallab.github.io/maml/
[MatGL]: https://matgl.ai
[MatPES]: https://matpes.ai
[MatCalc]: https://matcalc.ai
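The configuration names and feature schema in the dataset card suggest how a record might be loaded and inspected. The following is a minimal sketch, not an official usage guide. It assumes the Hugging Face `datasets` package is installed for the optional loader, and that configurations declared with a bare `data_files` entry expose a `train` split. The helper names `load_matpes` and `summarize_record` and the toy record are hypothetical and carry made-up values, not data from MatPES.

```python
import math

# Config name -> data file, copied from the `configs` block of the dataset card.
CONFIG_FILES = {
    "pbe": "MatPES-PBE-2025.2-charges.json",
    "r2scan": "MatPES-R2SCAN-2025.2-charges.json",
    "pbe-2025.2": "MatPES-PBE-2025.2-charges.json",
    "r2scan-2025.2": "MatPES-R2SCAN-2025.2-charges.json",
    "pbe-2025.1": "MatPES-PBE-2025.1-charges.json",
    "r2scan-2025.1": "MatPES-R2SCAN-2025.1-charges.json",
    "pbe-atoms": "MatPES-PBE-atoms.json",
    "r2scan-atoms": "MatPES-R2SCAN-atoms.json",
}


def load_matpes(config="pbe"):
    """Hypothetical helper: load one configuration from the Hub.

    Needs network access and the `datasets` package, so the import is
    deferred until the function is actually called.
    """
    if config not in CONFIG_FILES:
        raise ValueError(f"unknown config {config!r}; expected one of {sorted(CONFIG_FILES)}")
    from datasets import load_dataset

    # Assumption: every config exposes a "train" split.
    return load_dataset("Materialyze/matpes", config, split="train")


def summarize_record(record):
    """Hypothetical helper: derive simple per-structure quantities.

    Uses only fields named in the card's schema: `matpes_id`, `nsites`,
    `energy`, and `forces` (one 3-vector per site).
    """
    nsites = record["nsites"]
    forces = record["forces"]
    if len(forces) != nsites:
        raise ValueError("expected one force vector per site")
    norms = [math.sqrt(sum(c * c for c in f)) for f in forces]
    return {
        "matpes_id": record["matpes_id"],
        "energy_per_atom": record["energy"] / nsites,
        "max_force_norm": max(norms),
    }


# Toy record with invented values, shaped like the schema; not real MatPES data.
toy_record = {
    "matpes_id": "toy-0",
    "nsites": 2,
    "energy": -10.0,
    "forces": [[0.0, 0.0, 0.5], [0.0, 0.0, -0.5]],
}
summary = summarize_record(toy_record)
```

With real data, the same summary could be computed on the first row via `summarize_record(load_matpes("pbe")[0])`, provided the loaded rows keep the field layout listed in the card.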
Provider:
Materialyze