
Saccharomyces eubayanus strain CBS12357 genome sequencing

NIAID Data Ecosystem, indexed 2026-03-09
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA264003
Description:
Two libraries were prepared and sequenced using Illumina technology. The first was a 100-cycle paired-end library with an expected insert size of 150 bp. The overlapping read pairs were merged into 3,293,656 single, longer “pseudo” reads of 143 ± 20.5 bp, a total of 471 Mb, and these were assembled into 372 contigs of 500 bp or longer with a combined length of 11.49 Mbp (Table 2). In a second assembly step, a 50-cycle mate-pair library (17,521,927 pairs) with an 8-kb insert size, representing 1.75 Gb, was used to further structure the S. eubayanus CBS12357 genome sequence. Scaffolding reduced the 372 contigs to 76 scaffolds (Table 2) and gave a slightly larger haploid genome size of 11.9 Mb, sequenced at a coverage depth of 185-fold. The scaffolded genome sequence was annotated with the MAKER2 pipeline (Holt & Yandell, 2011), yielding 5,238 annotated ORFs.
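The data volumes quoted in the description can be reproduced from the read counts with simple arithmetic. The sketch below is only a sanity check: the constant and function names are illustrative, and the coverage formula (total sequenced bases divided by the scaffolded genome size) is an assumption, since the description does not say how the 185-fold figure was derived.

```python
# Figures quoted in the description; names are illustrative, not from the source.
MERGED_READS = 3_293_656        # merged paired-end "pseudo" reads
MEAN_MERGED_LEN_BP = 143        # mean pseudo-read length in bp
MATE_PAIRS = 17_521_927         # mate-pair read pairs
MATE_PAIR_CYCLES = 50           # bases sequenced per mate-pair read
SCAFFOLDED_GENOME_BP = 11.9e6   # scaffolded haploid genome size


def paired_end_bases() -> int:
    """Total bases in the merged paired-end pseudo reads."""
    return MERGED_READS * MEAN_MERGED_LEN_BP


def mate_pair_bases() -> int:
    """Total bases in the mate-pair library (two reads per pair)."""
    return MATE_PAIRS * 2 * MATE_PAIR_CYCLES


def naive_coverage() -> float:
    """Assumed estimate: all sequenced bases over the scaffolded genome size."""
    return (paired_end_bases() + mate_pair_bases()) / SCAFFOLDED_GENOME_BP


if __name__ == "__main__":
    print(f"paired-end: {paired_end_bases() / 1e6:.0f} Mb")
    print(f"mate-pair:  {mate_pair_bases() / 1e9:.2f} Gb")
    print(f"naive coverage: {naive_coverage():.0f}-fold")
```

The two library totals match the quoted 471 Mb and 1.75 Gb. The naive coverage estimate comes out a little above the reported 185-fold, which would be consistent with some reads being filtered or left unmapped, although the description does not confirm this.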

Created:
2014-10-16