Data from: Assexon: assembling exon using gene capture data
DataCite Commons, updated 2025-06-01, indexed 2025-06-15
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.b0r170c
Description:
Exon capture across species has been one of the most broadly applied
approaches to acquire multi-locus data in phylogenomic studies of
non-model organisms. Methods for assembling loci from short-read sequences
(e.g., Illumina platforms) that rely on mapping reads to a reference genome
may not be suitable for studies comprising species across a wide
phylogenetic spectrum; thus, de novo assembly methods are more generally
applied. Current approaches for assembling targeted exons from short reads
are not well optimized, as they cannot (1) assemble loci with low
read depth, (2) handle large files efficiently, or (3) reliably address
issues with paralogs. Thus, we present Assexon, a streamlined pipeline
that de novo assembles targeted exons and their flanking sequences from
raw reads. We tested our method using reads from Lepisosteus osseus
(4.37 Gb) and Boleophthalmus pectinirostris (2.43 Gb), which were captured
using baits designed from the genome sequences of Lepisosteus
oculatus and Oreochromis niloticus, respectively. We compared the performance
of Assexon to PHYLUCE and HybPiper, which are commonly used pipelines to
assemble ultra-conserved element (UCE) and Hyb-seq data, respectively. A custom exon
capture analysis pipeline (CP) developed by Yuan et al. was also included in
the comparison. Assexon accurately assembled 3400 to 3800 (20%-28%) more loci
than PHYLUCE and 1900 to 2300 (8%-14%) more loci than HybPiper across
different levels of phylogenetic divergence. Assexon ran at least twice as
fast as PHYLUCE and HybPiper. The number of loci assembled using CP was
comparable to that of Assexon in both tests, but Assexon ran at least 7 times
faster than CP. In addition, some steps of CP require user
interaction and are not fully automated, and this user time was not
counted in our calculation. Neither Assexon nor CP retrieved paralogs in
the test runs, whereas PHYLUCE and HybPiper did. In conclusion, Assexon is
a tool for accurate and efficient assembly of large read sets from exon
capture experiments. Furthermore, Assexon includes scripts to filter
poorly aligned coding regions and flanking regions, calculate summary
statistics of loci, and select loci with reliable phylogenetic signal.
Assexon is available at https://github.com/yhadevol/Assexon.
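
To illustrate the kind of per-locus summary statistics mentioned above, here is a minimal Python sketch. It is not one of Assexon's own scripts. The function names `parse_fasta` and `locus_summary`, the statistics chosen, and the toy alignment are all assumptions made for this example.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the sequence name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def locus_summary(records: Dict[str, str]) -> Dict[str, float]:
    """Summarize one aligned locus (hypothetical statistics, not Assexon's)."""
    if not records:
        raise ValueError("no sequences in alignment")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences in an alignment must have equal length")
    aln_len = lengths.pop()
    joined = "".join(records.values())
    # Treat gaps and ambiguous calls as missing data (an assumption).
    missing = sum(joined.count(c) for c in "-N?")
    called = len(joined) - missing
    gc = joined.count("G") + joined.count("C")
    return {
        "num_taxa": len(records),
        "alignment_length": aln_len,
        "gc_content": gc / called if called else 0.0,
        "missing_fraction": missing / len(joined) if joined else 0.0,
    }


# Toy alignment invented for this example.
example = """>taxonA
ATGCGC--AT
>taxonB
ATGCGCNNAT
>taxonC
ATGAGCTTAT
"""
stats = locus_summary(parse_fasta(example))
print(stats)
```

Statistics like these could be used to rank or discard loci before phylogenetic analysis. For the actual filtering criteria, consult the Assexon repository.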
Provider:
Dryad
Created:
2019-09-26



