Cosmopolites sordidus genome assemblies
Source: DataCite Commons | Updated: 2026-03-17 | Indexed: 2026-04-25
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.f1vhhmh2r
Description:
PacBio HiFi sequencing was employed in combination with metagenomic
binning to produce a high-quality reference genome of Cosmopolites
sordidus. We compared k-mer and alignment reference-based pre-binning and
post-binning approaches to remove contamination. We also asked whether
the post-binning approach left bacterial contamination interspersed within
intragenic regions of Arthropoda-binned contigs. Our analyses identified
3,433 genes composed of reads identified as being of putative bacterial
origin. The pre-binning approach yielded a 1.07 Gb C. sordidus genome
composed of 3,089 contigs, with complete and single-copy BUSCO scores of
98.6% (genome) and 97.1% (protein). In this paper, we demonstrate that, in this case, the
pre-binning approach does not sacrifice assembly quality for more
stringent metagenomic filtering. We also determine that post-binning allows
for intragenic contamination, which increased with increasing coverage, although
the frequency of gene contamination increased with lower coverage.
Finally, NCBI’s new FCS-GX program was used as a final post-assembly
classification approach and identified contamination in both pre- and
post-binning assemblies. This indicates that both pre- and post-binning
approaches are required to fully remove contamination. Future work should
focus on developing reference-free pre-binning approaches for HiFi reads
produced from eukaryotic-based metagenomic samples.
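
To illustrate the general idea behind reference-based k-mer pre-binning (classifying reads before assembly and keeping only those assigned to the target taxon), here is a minimal toy sketch. It is not the pipeline used to produce this dataset: the function names (`kmers`, `classify_read`, `prebin`), the tiny k-mer size, the `min_fraction` threshold, and the example sequences are all hypothetical and chosen only for readability.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_read(read, reference_kmers, k=5, min_fraction=0.5):
    """Assign a read to the reference label sharing the largest fraction
    of the read's k-mers, or 'unclassified' if no label reaches
    min_fraction (a hypothetical threshold)."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return "unclassified"
    best_label, best_frac = "unclassified", 0.0
    for label, ref in reference_kmers.items():
        frac = len(read_kmers & ref) / len(read_kmers)
        if frac > best_frac:
            best_label, best_frac = label, frac
    return best_label if best_frac >= min_fraction else "unclassified"


def prebin(reads, references, k=5, keep="Arthropoda"):
    """Split reads into those classified as `keep` and everything else,
    before any assembly step takes place."""
    reference_kmers = {label: kmers(seq, k) for label, seq in references.items()}
    kept, removed = [], []
    for read in reads:
        label = classify_read(read, reference_kmers, k)
        (kept if label == keep else removed).append(read)
    return kept, removed


# Toy reference sequences and reads (invented for illustration only).
references = {
    "Arthropoda": "ACGTACGTTAGCCGATAGCTAGGCTA",
    "Bacteria": "GGGGCCCCGGGGCCCCTTTTGGGGCC",
}
reads = ["ACGTTAGCCGATAG", "GGCCCCGGGGCC", "TTTTTTTT"]
kept, removed = prebin(reads, references)
```

In this sketch, unclassified reads are discarded along with bacterial ones, which is the stricter of two possible choices; a real workflow would need to decide how to treat reads with no reference match, and that decision depends on reference database completeness.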
Provider:
Dryad
Created:
2023-10-06



