Bacterial contaminants from a tardigrade genomic assembly
Source: Figshare | Updated: 2016-01-23 | Indexed: 2026-04-08
Download link:
https://figshare.com/articles/dataset/Bacterial_contaminants_from_a_tardigrade_genomic_assembly/2066925/1
Description:
High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results requires advanced bioinformatics approaches. In a recent study, Boothby et al. (doi: 10.1073/pnas.1510461112) reported the first draft genome for a tardigrade, Hypsibius dujardini, and detected a remarkable number of genes of bacterial origin. The authors suggested that horizontal gene transfers (HGTs) could explain the unique ability of this microscopic animal to withstand extreme ranges of temperature, pressure, and radiation. However, a subsequent analysis by Koutsovoulos et al. (doi: 10.1101/033464) revealed extensive bacterial contamination in Boothby et al.'s results, and reported a curated, yet much smaller, genome for H. dujardini. Here, we re-analyzed the sequencing data generated by both groups using approaches routinely employed by microbial ecologists. We provide a holistic view of the draft tardigrade genome by displaying DNA and RNA data originating from 12 sequencing libraries, along with additional metadata, in one display. We identified and removed multiple near-complete bacterial genomes, and report a curated draft genome of 182 Mbp supported by RNA-Seq data. This genome contains only 3.9% of the genes Boothby et al. identified to support the extended HGT hypothesis, yet is 34.8% longer than the curated genome reported by Koutsovoulos et al. Our results indicate that long-read libraries introduced most contaminants, and that the type of bacterial contaminants varied between library preparations and between sequencing efforts. Visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the far more complex needs of today's microbiologists, who are constantly challenged by environmental sequencing data.
While the advent of high-throughput sequencing blurs the boundaries between different ends of the life sciences, increased communication is essential to identify and disseminate best practices across disciplines.
Created:
2016-01-23



