Marks-lab/PromoterZoo
Source: Hugging Face. Updated: 2025-09-16. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/Marks-lab/PromoterZoo
Description:
The PromoterZoo dataset contains approximately 13.6 million promoter sequences from a range of species and can be used to train genomic sequence models. Each sequence is 1000 base pairs long and represents a promoter region. The dataset covers species from multiple evolutionary clades, and each sequence is annotated with its gene symbol, species name, clade, chromosome, strand orientation, and genomic start and end positions.
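As a rough illustration of the record layout described above, the sketch below models one entry and checks it against the stated 1000 bp length. The field names, the "+"/"-" strand encoding, and the start-before-end coordinate convention are assumptions made for this example; the description lists the kinds of metadata but not the actual column names or conventions, so check them against the dataset before relying on this.

```python
from dataclasses import dataclass

# Sequence length stated in the dataset description.
PROMOTER_LENGTH = 1000


@dataclass(frozen=True)
class PromoterRecord:
    """One promoter entry. Field names are hypothetical and only mirror
    the metadata listed in the description."""
    gene_symbol: str
    species: str
    clade: str
    chromosome: str
    strand: str
    start: int
    end: int
    sequence: str


def validation_errors(record: PromoterRecord) -> list[str]:
    """Return a list of human-readable problems; empty if none were found."""
    errors = []
    if len(record.sequence) != PROMOTER_LENGTH:
        errors.append(
            f"expected {PROMOTER_LENGTH} bp, got {len(record.sequence)}"
        )
    # Assumption: strand is encoded as "+" or "-".
    if record.strand not in ("+", "-"):
        errors.append(f"unexpected strand value: {record.strand!r}")
    # Assumption: start is always smaller than end, whatever the strand.
    if record.start >= record.end:
        errors.append("start position is not smaller than end position")
    return errors


# Made-up placeholder record, not taken from the dataset.
example = PromoterRecord(
    gene_symbol="GENE1",
    species="Example species",
    clade="example_clade",
    chromosome="chr1",
    strand="+",
    start=10_000,
    end=11_000,
    sequence="ACGT" * 250,
)

print(validation_errors(example))
```

A check of this kind can be run over a sample of records before training to confirm that the real columns match what the description promises.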
Provider:
Marks-lab



