
colabfit/solvated_protein_fragments_JCTC_2019

Source: Hugging Face. Updated 2025-11-18. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/colabfit/solvated_protein_fragments_JCTC_2019
Description:
The solvated protein fragments JCTC 2019 dataset is a benchmark dataset generated to measure the performance of machine learning models in describing chemical reactions, long-range interactions, and condensed phase systems. The dataset contains structures for all possible amons (hydrogen-saturated covalently bonded fragments) of up to eight heavy atoms (C, N, O, S) that can be derived from chemical graphs of proteins containing the 20 natural amino acids connected via peptide bonds or disulfide bridges. For amino acids that can exist in different charge states due to (de)protonation (i.e., carboxylic acids that can be negatively charged or amines that can be positively charged), all possible structures with a total charge of up to ±2e are included. The dataset provides reference energies, forces, and dipole moments for 2,731,180 structures calculated at the revPBE-D3(BJ)/def2-TZVP level of theory using ORCA 4.0.1.
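The composition and charge limits stated above can be written as a small check. This is a minimal sketch, not part of the dataset's tooling: the function names and the input layout (a list of atomic numbers plus an integer total charge) are assumptions made for illustration, and the dataset's actual column names are not given here.

```python
from typing import Sequence

# Atomic numbers of the heavy elements named in the description: C, N, O, S.
HEAVY_ELEMENTS = {6, 7, 8, 16}
MAX_HEAVY_ATOMS = 8   # "up to eight heavy atoms"
MAX_ABS_CHARGE = 2    # "total charge of up to ±2e"


def count_heavy_atoms(atomic_numbers: Sequence[int]) -> int:
    """Count every atom that is not hydrogen (Z > 1)."""
    return sum(1 for z in atomic_numbers if z > 1)


def matches_description(atomic_numbers: Sequence[int], total_charge: int) -> bool:
    """Hypothetical helper: True if a single fragment fits the stated limits.

    Checks only element types, heavy-atom count, and total charge; it does
    not verify bonding, hydrogen saturation, or derivability from a protein.
    """
    heavy = [z for z in atomic_numbers if z > 1]
    return (
        all(z in HEAVY_ELEMENTS for z in heavy)
        and 1 <= len(heavy) <= MAX_HEAVY_ATOMS
        and abs(total_charge) <= MAX_ABS_CHARGE
    )


if __name__ == "__main__":
    methane = [6, 1, 1, 1, 1]        # CH4, neutral
    ammonium = [7, 1, 1, 1, 1]       # NH4+, charge +1
    print(matches_description(methane, 0))
    print(matches_description(ammonium, 1))
```

The check applies to one isolated fragment. Since the dataset name suggests solvated structures, records that also contain solvent molecules would not be expected to pass it unchanged.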
Provided by:
colabfit