
Bacterial contaminants from a tardigrade genomic assembly

Source: Figshare. Updated: 2016-01-23. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Bacterial_contaminants_from_a_tardigrade_genomic_assembly/2066925/1
Description:
High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results requires advanced bioinformatics approaches. In a recent study, Boothby et al. (doi: 10.1073/pnas.1510461112) reported the first draft genome for a tardigrade, Hypsibius dujardini, and detected a remarkable number of genes of bacterial origin. The authors suggested that horizontal gene transfers (HGTs) could explain the unique ability of this microscopic animal to withstand extreme ranges of temperature, pressure, and radiation. However, a subsequent analysis by Koutsovoulos et al. (doi: 10.1101/033464) revealed extensive bacterial contamination in Boothby et al.'s results, and reported a curated, yet much smaller, genome for H. dujardini. Here, we re-analyzed the sequencing data generated by both groups using approaches routinely employed by microbial ecologists. We provide a holistic view of the draft tardigrade genome by displaying DNA and RNA data originating from 12 sequencing libraries, along with additional metadata, in one display. We identified and removed multiple near-complete bacterial genomes, and report a curated draft genome of 182 Mbp supported by RNA-Seq data. This genome contains only 3.9% of the genes Boothby et al. identified to support the extended HGT hypothesis, yet is 34.8% longer than the curated genome reported by Koutsovoulos et al. Our results indicate that long-read libraries introduced most contaminants, and that the types of bacterial contaminants varied between library preparations and between sequencing efforts. Visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the far more complex needs of today's microbiologists, who are constantly challenged by environmental sequencing data.
While the advent of high-throughput sequencing blurs the boundaries between different ends of life sciences, increased communication is essential to identify and disseminate best practices across disciplines.
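
The two size figures in the description (a 182 Mbp curated genome that is 34.8% longer than the one reported by Koutsovoulos et al.) imply a size for that earlier curated genome, which the description does not state directly. The following is a minimal sketch of that arithmetic only. The function and variable names are illustrative assumptions, and the result is a back-calculation from the rounded figures above, not a value taken from either study.

```python
def implied_baseline_mbp(curated_mbp: float, percent_longer: float) -> float:
    """Back-calculate the reference genome size from a 'X% longer' statement.

    If curated = baseline * (1 + percent_longer / 100), then
    baseline = curated / (1 + percent_longer / 100).
    """
    return curated_mbp / (1.0 + percent_longer / 100.0)


# Figures quoted in the description (both are rounded values).
curated_mbp = 182.0      # curated draft genome reported here, in Mbp
percent_longer = 34.8    # stated excess over the Koutsovoulos et al. genome

baseline_mbp = implied_baseline_mbp(curated_mbp, percent_longer)
print(f"Implied size of the earlier curated genome: {baseline_mbp:.1f} Mbp")
```

Under these assumptions the implied size comes out at roughly 135 Mbp, so the curated assembly described here retains on the order of 47 Mbp more sequence than the earlier curation.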
Created: 2016-01-23