
SaProtHub/Dataset-DLG4_RAT

Source: Hugging Face. Updated 2025-02-04. Indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/SaProtHub/Dataset-DLG4_RAT
Description:
This dataset contains single-site mutations of the DLG4_RAT protein amino acid sequence and their corresponding mutation effect scores from a deep mutational scanning experiment. Each protein is represented as a single amino acid sequence. The dataset is split into training, validation, and test sets containing 1325, 159, and 176 samples respectively. Each label is the fitness score of a mutant amino acid sequence, based on the log of the frequency with which each amino acid is observed at each position in the selected versus the unselected population, relative to the wild type. Effect scores range from negative infinity to positive infinity: a score greater than 0 indicates high fitness, and a score less than 0 indicates low fitness.
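To make the label definition concrete, here is a minimal sketch of how a log-enrichment fitness score of this kind could be computed from read counts. It is an illustration under stated assumptions, not the dataset authors' pipeline: the function names, the use of the natural logarithm, and the read counts are all hypothetical, and zero counts (which would give infinite scores) are not handled.

```python
import math

def log_enrichment(sel_count, unsel_count, sel_total, unsel_total):
    """Log ratio of a variant's frequency in the selected vs. unselected population."""
    sel_freq = sel_count / sel_total
    unsel_freq = unsel_count / unsel_total
    return math.log(sel_freq / unsel_freq)

def fitness_score(mut_sel, mut_unsel, wt_sel, wt_unsel, sel_total, unsel_total):
    """Mutant log enrichment minus wild-type log enrichment (assumed definition)."""
    mutant = log_enrichment(mut_sel, mut_unsel, sel_total, unsel_total)
    wild_type = log_enrichment(wt_sel, wt_unsel, sel_total, unsel_total)
    return mutant - wild_type

# Hypothetical read counts, for illustration only (not taken from the dataset).
totals = dict(sel_total=10_000, unsel_total=10_000)
enriched = fitness_score(mut_sel=200, mut_unsel=100, wt_sel=1000, wt_unsel=1000, **totals)
depleted = fitness_score(mut_sel=50, mut_unsel=100, wt_sel=1000, wt_unsel=1000, **totals)
neutral = fitness_score(mut_sel=1000, mut_unsel=1000, wt_sel=1000, wt_unsel=1000, **totals)

print(f"enriched: {enriched:.3f}, depleted: {depleted:.3f}, neutral: {neutral:.3f}")
```

Under this assumed definition, a variant that is enriched relative to the wild type after selection gets a positive score, a depleted variant gets a negative score, and a variant that behaves like the wild type scores 0, which matches the sign convention described above.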
Provider:
SaProtHub