Synthyra/clustered_ppi_string_dedup
Source: Hugging Face. Updated 2026-02-04. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/Synthyra/clustered_ppi_string_dedup
Summary:
This dataset repo contains multiple variants of a protein–protein interaction (PPI) dataset built from the BIOGRID and STRING databases. Proteins are clustered by sequence similarity, and the train/valid/test splits are then constructed so that they are intended to be disjoint at the protein level. Each variant includes detailed statistics and plots, such as label balance, organism distributions, and sequence length distributions. Artifacts are stored as compressed pickles, and a helper downloader is provided for fetching and loading the data.
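The protein-level disjointness described above can be spot-checked once a variant is loaded. The sketch below is only an illustration under stated assumptions: it assumes each split is a list of (protein_id_a, protein_id_b) tuples, which is a hypothetical record shape and not the documented layout of the pickles, and the names `proteins_in`, `shared_proteins`, and `toy_splits` are made up for this example.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record shape: one interaction is a pair of protein IDs.
Pair = Tuple[str, str]


def proteins_in(pairs: Iterable[Pair]) -> Set[str]:
    """Collect every protein ID that appears in a list of interaction pairs."""
    ids: Set[str] = set()
    for protein_a, protein_b in pairs:
        ids.add(protein_a)
        ids.add(protein_b)
    return ids


def shared_proteins(splits: Dict[str, List[Pair]]) -> Dict[Tuple[str, str], Set[str]]:
    """Return the protein IDs shared by each pair of splits.

    An empty set for a pair of splits means they are disjoint at the protein level.
    """
    names = sorted(splits)
    overlaps: Dict[Tuple[str, str], Set[str]] = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            overlaps[(first, second)] = proteins_in(splits[first]) & proteins_in(splits[second])
    return overlaps


# Toy data (not taken from the dataset): "P3" deliberately leaks from train into test.
toy_splits = {
    "train": [("P1", "P2"), ("P2", "P3")],
    "valid": [("P4", "P5")],
    "test": [("P6", "P7"), ("P3", "P6")],
}

overlaps = shared_proteins(toy_splits)
for split_pair, leaked in overlaps.items():
    print(split_pair, sorted(leaked))
```

On real data, any non-empty set returned by `shared_proteins` would indicate a protein that appears in more than one split. Note that this checks only for identical IDs; it does not measure sequence similarity between proteins in different splits.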
Provider:
Synthyra



