
Cosmopolites sordidus genome assemblies

Source: DataCite Commons. Updated 2026-03-17; indexed 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.f1vhhmh2r
Description:
PacBio HiFi sequencing was employed in combination with metagenomic binning to produce a high-quality reference genome of Cosmopolites sordidus. We compared k-mer-based and alignment-based (reference-based) pre-binning and post-binning approaches to remove contamination. We also wanted to know whether the post-binning approach had left bacterial contamination interspersed within intragenic regions of Arthropoda-binned contigs. Our analyses identified 3,433 genes composed of reads identified as being of putative bacterial origin. The pre-binning approach yielded a C. sordidus genome of 1.07 Gb composed of 3,089 contigs, with complete, single-copy genome and protein BUSCO scores of 98.6% and 97.1%, respectively. In this paper, we demonstrate that, in this case, the pre-binning approach does not sacrifice assembly quality for more stringent metagenomic filtering. We also determine that post-binning allows for increased intragenic contamination: contamination increased with increasing coverage, but the frequency of gene contamination increased with lower coverage. Finally, NCBI’s new FCS-GX program was used as a final post-assembly classification approach and identified contamination in both the pre- and post-binning assemblies. This indicates that both pre- and post-binning approaches are required to fully remove contamination. Future work should focus on developing reference-free pre-binning approaches for HiFi reads produced from eukaryote-based metagenomic samples.
Provider:
Dryad
Created:
2023-10-06