vladak/string_ppi_human_1M
Source: Hugging Face | Updated: 2025-08-06 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/vladak/string_ppi_human_1M
Description:
The STRING PPI Human 1M dataset contains 1 million human protein-protein interactions (PPIs) derived from version 11.5 of the STRING database. Each record includes the amino acid sequences of the two interacting proteins (seq_a and seq_b), their protein names from STRING (seq_name_a and seq_name_b), a normalized score between 0 and 1 (score), and a binary label marking high-confidence positive interactions (label). The dataset has been filtered and preprocessed, and is suitable for training binary PPI prediction models or embedding-based methods.
Provider:
vladak
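
The record layout in the description can be sketched as a small Python structure with a sanity check. This is a minimal illustration, not the dataset author's code: the field names come from the description, but the `PPIRecord` class, the `validate` and `split_by_label` helpers, the toy sequences and names, and the reading of `label == 1` as "high-confidence positive" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PPIRecord:
    """One interaction row, using the field names from the description."""
    seq_a: str        # amino acid sequence of the first protein
    seq_b: str        # amino acid sequence of the second protein
    seq_name_a: str   # STRING name of the first protein
    seq_name_b: str   # STRING name of the second protein
    score: float      # normalized score, expected in [0, 1]
    label: int        # binary label; assumed here: 1 = high-confidence positive


def validate(rec: PPIRecord) -> List[str]:
    """Return a list of problems found in a record (empty if it looks fine)."""
    problems = []
    if not rec.seq_a or not rec.seq_b:
        problems.append("empty sequence")
    if not 0.0 <= rec.score <= 1.0:
        problems.append("score outside [0, 1]")
    if rec.label not in (0, 1):
        problems.append("label is not binary")
    return problems


def split_by_label(records: List[PPIRecord]) -> Tuple[List[PPIRecord], List[PPIRecord]]:
    """Split records into (positives, negatives) for binary PPI prediction."""
    positives = [r for r in records if r.label == 1]
    negatives = [r for r in records if r.label == 0]
    return positives, negatives


# Toy rows with made-up sequences and names, NOT taken from the dataset.
toy_records = [
    PPIRecord("MKTAYIAK", "MSDNELQR", "toy_protein_1", "toy_protein_2", 0.95, 1),
    PPIRecord("MKTAYIAK", "MGLSDGEW", "toy_protein_1", "toy_protein_3", 0.20, 0),
]

positives, negatives = split_by_label(toy_records)
```

Rows obtained from the Hugging Face copy of the dataset could be passed through the same checks, assuming its columns match the names listed in the description.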



