neuralbioinfo/PhaStyle-SequenceDB
Source: Hugging Face | Updated: 2025-06-14 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/neuralbioinfo/PhaStyle-SequenceDB
Description:
The PhaStyle-SequenceDB dataset consists of phage sequences, each labeled with its lifestyle (virulent or temperate). It is split into four subsets: the BACPHLIP training set, with 1868 sequences from non-Escherichia phages, used for model training; the BACPHLIP validation set, with 394 Escherichia coli phage sequences, used to validate model performance; the EXTREMOPHILE set, with 16 phage sequences from extreme environments; and the ESCHERICHIA set, which contains the Guelin collection plus 100 additional, randomly selected high-quality temperate phages. The dataset is intended to support training and evaluating genomic models for phage lifestyle prediction.
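A minimal sketch of how the labeled sequences might be loaded and tallied by lifestyle, assuming the Hugging Face `datasets` library is installed. The helper names `load_phastyle` and `lifestyle_counts`, the `label` column name, and the toy records are hypothetical and not taken from the dataset itself; the dataset may also need a configuration name, which this listing does not give.

```python
from collections import Counter


def load_phastyle(config_name=None):
    """Fetch the dataset from the Hugging Face Hub (needs network access).

    config_name is an assumption: the listing does not say whether the
    four subsets are exposed as configurations, splits, or files.
    """
    from datasets import load_dataset  # pip install datasets

    return load_dataset("neuralbioinfo/PhaStyle-SequenceDB", name=config_name)


def lifestyle_counts(records, label_key="label"):
    """Count records per lifestyle label.

    label_key is a hypothetical column name; check the real column names
    on the loaded dataset before relying on it.
    """
    return Counter(str(record[label_key]).lower() for record in records)


# Toy records standing in for real rows (invented for illustration only).
toy_records = [
    {"sequence": "ATGCGT", "label": "virulent"},
    {"sequence": "GGCATA", "label": "temperate"},
    {"sequence": "TTAGCC", "label": "Virulent"},
]

counts = lifestyle_counts(toy_records)
print(dict(counts))
```

The download step is kept inside `load_phastyle`, so the tallying helper can be tried on local records without network access.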
Provided by:
neuralbioinfo



