
solvated protein fragments JCTC 2019

Source: materials.colabfit.org (indexed 2025-01-21)
Download link:
https://materials.colabfit.org/id/DS_ctjgc03xdauc_0
Description:
The solvated protein fragments dataset was generated as a partner benchmark dataset, along with SN2, for measuring how well machine learning models, in particular PhysNet, describe chemical reactions, long-range interactions, and condensed-phase systems. The dataset contains structures for all possible "amons" (hydrogen-saturated, covalently bonded fragments) of up to eight heavy atoms (C, N, O, S) that can be derived from chemical graphs of proteins containing the 20 natural amino acids connected via peptide bonds or disulfide bridges. For amino acids that can occur in different charge states due to (de)protonation (i.e., carboxylic acids that can be negatively charged or amines that can be positively charged), all possible structures with a total charge of up to ±2e are included. In total, the dataset provides reference energies, forces, and dipole moments for 2,731,180 structures calculated at the revPBE-D3(BJ)/def2-TZVP level of theory using ORCA 4.0.1.
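
The composition limits stated above can be sketched as a simple check. This is a minimal illustration only, not part of the dataset's tooling: the function name `matches_description` and the list-of-symbols input format are hypothetical, and the check covers a bare fragment only, since the description does not detail how solvent molecules are counted.

```python
# Hypothetical helper illustrating the limits quoted in the description:
# at most eight heavy atoms drawn from C, N, O, S, and |total charge| <= 2.
HEAVY_ELEMENTS = {"C", "N", "O", "S"}
MAX_HEAVY_ATOMS = 8
MAX_ABS_CHARGE = 2


def matches_description(symbols, total_charge):
    """Return True if a bare fragment fits the stated composition limits.

    symbols: list of element symbols, e.g. ["C", "H", "H", "H", "H"]
    total_charge: integer total charge in units of e
    """
    heavy = [s for s in symbols if s != "H"]
    # A fragment needs at least one heavy atom and no more than eight.
    if not heavy or len(heavy) > MAX_HEAVY_ATOMS:
        return False
    # Only C, N, O, and S are allowed as heavy atoms.
    if any(s not in HEAVY_ELEMENTS for s in heavy):
        return False
    return abs(total_charge) <= MAX_ABS_CHARGE


# Example inputs (illustrative, not taken from the dataset files).
glycine = ["N", "C", "C", "O", "O"] + ["H"] * 5
print(matches_description(glycine, 0))
```

A check like this only filters on element counts and charge; it says nothing about whether a fragment is actually a valid amon of a protein graph.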

Provider:
ColabFit