
Supporting data for "MinION nanopore sequencing of environmental metagenomes: a synthetic approach"

Source: DataCite Commons. Updated 2025-05-26; indexed 2025-04-15.
Download link:
http://gigadb.org/dataset/100278
Description:
Environmental metagenomic analysis is typically accomplished by assigning taxonomy and/or function from whole genome sequencing (WGS) or 16S amplicon sequences. Both approaches are limited by read length and by other technical and biological factors. MinION, a nanopore-based sequencing platform, produces reads of ≥10,000 bp, potentially allowing more precise assignment and thereby alleviating some of the limitations inherent in determining metagenome composition from short reads. We tested the ability of sequence data produced by MinION (R7.3 flow cells) to correctly assign taxonomy in single bacterial species runs and in three types of low-complexity synthetic communities: a mixture of DNA using equal mass from four species, a community with one relatively rare (1%) and three abundant (33% each) components, and a mixture of genomic DNA from 20 bacterial strains of staggered representation. Taxonomic composition of the low-complexity communities was assessed by analyzing the MinION sequence data with three different bioinformatic approaches: Kraken, MG-RAST, and One Codex.

Long-read sequences generated from libraries prepared from single strains using the SQK-MAP005 kit and chemistry, run on the original MinION device, yielded between 224 and 3,497 bidirectional high-quality (2D) reads, with an average read length across the study of 6,000 bp. For the single-strain analyses, assignment of reads to the correct genus by the different methods ranged from 53.1% to 99.5%, assignment to the correct species ranged from 23.9% to 99.5%, and the majority of mis-assigned reads were assigned to closely related organisms. A synthetic metagenome sequenced with the same setup yielded 714 high-quality 2D reads of approximately 5,500 bp, of which up to 98% were correctly assigned at the species level.

Synthetic metagenomes from MinION libraries generated using the SQK-MAP006 kit and chemistry yielded 899 to 3,497 2D reads averaging 5,700 bp in length, with up to 98% assignment accuracy at the species level. The observed community proportions for the equal and rare synthetic libraries were close to the known proportions, deviating by 0.1% to 10% across all tests. For a 20-species mock community with staggered contributions, a sequencing run detected all but 3 species (each included at <0.05% of DNA in the total mixture); 91% of reads were assigned to the correct species, 93% to the correct genus, and >99% to the correct family.
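The comparison of observed and known community proportions described above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the taxon labels, and the read counts are hypothetical placeholders invented for illustration, and only the expected proportions (one component at 1%, three at 33% each) come from the description.

```python
def observed_proportions(read_counts):
    """Convert per-taxon read counts into fractions of all assigned reads."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no assigned reads")
    return {taxon: n / total for taxon, n in read_counts.items()}


def deviations(read_counts, expected):
    """Absolute deviation from the expected proportion, in percentage points."""
    observed = observed_proportions(read_counts)
    return {
        taxon: abs(observed.get(taxon, 0.0) - p) * 100
        for taxon, p in expected.items()
    }


# Expected proportions for the "rare" synthetic community in the description.
expected = {"species_A": 0.33, "species_B": 0.33, "species_C": 0.33, "species_D": 0.01}

# HYPOTHETICAL read counts, invented for illustration; not data from the study.
counts = {"species_A": 300, "species_B": 350, "species_C": 330, "species_D": 20}

for taxon, dev in deviations(counts, expected).items():
    print(f"{taxon}: {dev:.1f} percentage points from expected")
```

In practice, the per-taxon read counts would be taken from the output of a classifier such as Kraken, MG-RAST, or One Codex; parsing those reports is outside the scope of this sketch.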

Provider:
GigaScience Database
Created:
2017-02-06