ORTHOSKIM PhyloNorway Samples
Source: NIAID Data Ecosystem. Indexed: 2026-03-13
Download link:
https://www.ncbi.nlm.nih.gov/sra/ERP133089
Description:
Low-coverage whole genome shotgun sequencing (or genome skimming) has emerged as an easy and cost-effective method for acquiring large amounts of genomic data in non-model organisms, with many applications for phylogenetic or barcoding analyses. This method allows assembling chloroplast genomes (cpDNA), mitochondrial genomes (mtDNA) and ribosomal regions (rDNA), which are over-represented within cells. However, numerous bioinformatic challenges remain in obtaining such data easily and rapidly in organisms with complex genomic structures and rearrangements, in particular for mtDNA in plants, or cpDNA in some plant families. Here we introduce ORTHOSKIM, a user-friendly pipeline designed to perform in silico capture of targeted sequences from genomic and transcriptomic libraries in three steps: untargeted (i.e. global) assembly, mapping against reference sequences, and sequence extraction with multiple checking stages. Different modes are implemented to capture cpDNA, mtDNA and rDNA sequences along with nuclear sequences (nuDNA) or single copy orthologs (BUSCO datasets). Moreover, phylogenetic matrices can be directly produced by performing multiple alignments of captured sequences across study libraries. To demonstrate its use and reliability, ORTHOSKIM was assessed using 114 genome skimming libraries and 4 RNAseq libraries in Primulaceae and Ericaceae, the latter being a well-known problematic family for cpDNA assemblies. The pipeline recovered cpDNA, mtDNA and rDNA sequences with high success rates, and these proved effective for producing character matrices well suited to inferring phylogenetic relationships within these families. ORTHOSKIM is freely released under a GPL-3 license and is available at https://github.com/cpouchon/ORTHOSKIM.
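The description mentions that phylogenetic matrices can be produced from alignments of captured sequences across libraries. As a rough illustration of that idea only, the sketch below concatenates per-locus alignments into a single matrix and pads libraries that lack a locus with gap characters. It is not ORTHOSKIM's code or interface: the function name `build_supermatrix`, the input layout, and the library names are all hypothetical, and the alignments are assumed to have been computed beforehand by some external aligner.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: locus name -> {library name -> aligned sequence}.
Alignment = Dict[str, str]


def build_supermatrix(
    alignments: Dict[str, Alignment], missing: str = "-"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-locus alignments into one matrix (illustrative only).

    Returns the concatenated sequence per library and a list of
    (locus, start, end) partitions using 1-based inclusive coordinates.
    """
    # Every library seen in at least one locus gets a row in the matrix.
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1

    for locus in sorted(alignments):
        aln = alignments[locus]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"locus {locus!r} is empty or not aligned")
        length = lengths.pop()

        for taxon in taxa:
            # Libraries without this locus are padded with the missing symbol.
            rows[taxon].append(aln.get(taxon, missing * length))

        partitions.append((locus, start, start + length - 1))
        start += length

    matrix = {taxon: "".join(parts) for taxon, parts in rows.items()}
    return matrix, partitions


if __name__ == "__main__":
    # Toy data; library names and sequences are made up for illustration.
    example = {
        "rbcL": {"libA": "ATG-", "libB": "ATGC"},
        "matK": {"libA": "GG"},
    }
    matrix, partitions = build_supermatrix(example)
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print(partitions)
```

Sorting loci and libraries keeps the output deterministic, and the partition list records where each locus sits in the concatenated matrix so that per-locus models can be applied downstream.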
Created:
2021-12-03



