
Bacterial training dataset for Galaxy training network tutorials on Genome assembly

Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-27.
Download link:
https://zenodo.org/record/582600
Description:
This training dataset is from an imaginary Staphylococcus aureus bacterium with a miniature genome. There is a reference genome in various formats, as well as some fastq reads of a closely related but also imaginary mutant strain.

It is a useful dataset for demonstrating:
- de novo genome assembly
- read mapping and variant calling
- genome annotation

The files included are:
- wildtype.fna: the reference genome sequence of the wildtype strain in fasta format (a header line, then the nucleotide sequence of the genome).
- wildtype.gff: the reference genome sequence of the wildtype strain in general feature format (a list of features, one feature per line, then the nucleotide sequence of the genome).
- wildtype.gbk: the reference genome sequence in genbank format.
- mutant_R1.fastq and mutant_R2.fastq: fastq sequence reads of a closely related mutant strain. The reads are paired-end. Each read is 150 bases long. The number of bases sequenced is equivalent to 19x the genome sequence of the wildtype strain (read coverage 19x, which is rather low).
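
The 19x coverage figure is the total number of sequenced bases divided by the length of the reference genome. The sketch below shows that arithmetic in Python on tiny in-memory stand-ins for the files. The helper names (fasta_length, fastq_bases, coverage) and the toy sequences are assumptions made for illustration; they are not part of the dataset or of any Galaxy tool, and the parser assumes simple four-line fastq records.

```python
import io


def fasta_length(handle):
    """Count sequence characters in a fasta stream, skipping header lines."""
    return sum(len(line.strip()) for line in handle if not line.startswith(">"))


def fastq_bases(handle):
    """Count bases in a fastq stream, assuming exactly four lines per record."""
    total = 0
    for index, line in enumerate(handle):
        if index % 4 == 1:  # the sequence is the second line of each record
            total += len(line.strip())
    return total


def coverage(genome_length, *base_counts):
    """Average read depth: total sequenced bases divided by genome length."""
    return sum(base_counts) / genome_length


# Toy stand-ins (hypothetical data) for wildtype.fna and the two read files.
toy_fasta = io.StringIO(">toy_genome\nACGTACGTAC\nGTACGTACGT\n")
toy_r1 = io.StringIO(
    "@read1/1\nACGTACGTAC\n+\nIIIIIIIIII\n"
    "@read2/1\nGTACGTACGT\n+\nIIIIIIIIII\n"
)
toy_r2 = io.StringIO(
    "@read1/2\nTTTTTTTTTT\n+\nIIIIIIIIII\n"
    "@read2/2\nCCCCCCCCCC\n+\nIIIIIIIIII\n"
)

genome_len = fasta_length(toy_fasta)
depth = coverage(genome_len, fastq_bases(toy_r1), fastq_bases(toy_r2))
print(f"genome length: {genome_len} bases, coverage: {depth:.1f}x")
```

To try this on the real files, the StringIO objects could be replaced with open("wildtype.fna"), open("mutant_R1.fastq") and open("mutant_R2.fastq"), provided the files have been downloaded to the working directory.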
Created: 2023-06-28