Probe sequence and target sequence files for the Bromeliad1815 probe set
Source: DataCite Commons. Updated: 2025-10-29. Indexed: 2026-02-09.
Download link:
https://figshare.com/articles/dataset/Probe_sequence_and_target_sequence_files_for_the_Bromeliad1815_probe_set/30472904
Description:
We developed the Bromeliad1815 bait set to capture low-copy nuclear genes across Bromeliaceae. The files in this repository include the probe sequences needed to order the probe kit and the target sequence file, which can be used with HybPiper or other analysis software. We modified a bait kit adapted from the Bromeliad1776 kit (Yardeni et al., 2022), which itself was based on whole-genome assemblies of <i>Ananas</i> (Ming et al., 2015) and <i>Tillandsia</i> (de La Harpe et al., 2020).

To capture genomic diversity across all bromeliad subfamilies, seven additional bromeliad genomes were sequenced by Novogene using paired-end 150-bp Illumina NovaSeq reads. We assembled genomes using MaSuRCA version 4.0.9 (Zimin et al., 2013) for <i>Brocchinia acuminata</i>, <i>B. paniculata</i>, and <i>B. reducta</i> (Brocchinioideae), <i>Lindmania longipes</i> (Lindmanioideae), <i>Hechtia lundelliorum</i> (Hechtioideae), <i>Navia splendens</i> (Navioideae), and <i>Pitcairnia atrorubens</i> (Pitcairnioideae). Assembled genome sizes ranged from 278.0 to 364.8 Mb, with estimated coverages of 64.1-103.9x. We also included published genome sequences for <i>Puya raimondii</i> (Liu et al., 2021) of Puyoideae and <i>Ananas comosus</i> (Ming et al., 2015) of Bromelioideae. Although we did not include any tillandsioid sequences, the original Bromeliad1776 bait kit included extensive sampling from <i>Tillandsia</i> (Yardeni et al., 2022).

Using HybPiper version 1.3 (Johnson et al., 2016), we assembled the target sequences for the 1776 genes identified by Yardeni et al. (2022) from raw reads cleaned by Fastp version 0.12.4 (Chen et al., 2018). Sequences of each gene were aligned using MAFFT version 7.490 (Katoh et al., 2002) and trimmed using TrimAl version 1.4.1 (Capella-Gutiérrez et al., 2009). We then inferred a gene tree for each locus across the nine species with whole-genome sequence (WGS) data, as well as the original <i>Ananas</i> reference, using maximum likelihood in IQ-TREE version 2.0.7 (Minh et al., 2020a), collapsing branches with ultrafast bootstrap support (UF BS) below 70% using GoTree version 0.4.4 (Lemoine & Gascuel, 2021). We discarded 170 loci that produced unresolved trees as uninformative; 1305 loci were considered informative because they yielded trees with more than four nodes with ≥ 90% UF BS.

We calculated pairwise difference scores using Mothur version 1.46.1 (Schloss et al., 2009) for assembled target sequences at the locus level. We then selected 351 target sequences that exhibited ≥ 15% divergence from the corresponding sequences for <i>Ananas comosus</i>, which was used as the basis for the Bromeliad1776 kit design. According to Arbor Biosciences (2020), baits can capture target sequences with up to 10-15% divergence if capture is conducted at 62°C. We then identified a total of 209 sequences shared between the 1305 informative loci and the 351 divergent sequences. Those 209 overlapping sequences were added to our modified kit; 168 of these came from <i>Brocchinia</i>. Our new Bromeliad1815 bait kit comprises 1815 loci (57,000 baits) and is tailored to capture sequences more effectively beyond Bromelioideae and Tillandsioideae. The kit was tested and manufactured by Arbor Biosciences (Ann Arbor, MI, USA).
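The locus-selection logic described above (informative loci intersected with sequences that are ≥ 15% divergent from the reference) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `p_distance` and `select_loci`, the input dictionaries, and the simple gap-skipping distance are hypothetical, and Mothur's own distance calculation may treat gaps and ambiguous bases differently.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns containing a gap or ambiguous base ('-', 'N', '?')
    in either sequence are skipped. This is not necessarily how Mothur
    scores such columns.
    """
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N?" or b in "-N?":
            continue
        compared += 1
        if a != b:
            differences += 1
    return differences / compared if compared else 0.0


def select_loci(support_by_locus, divergence_by_locus,
                min_support=90, min_divergence=0.15):
    """Return loci that are both informative and divergent.

    support_by_locus: hypothetical dict mapping a locus name to the list of
        UF bootstrap values on the internal nodes of its gene tree.
    divergence_by_locus: hypothetical dict mapping a locus name to its
        pairwise difference from the reference sequence (0 to 1).
    """
    # Informative: more than four nodes with support >= min_support.
    informative = {
        locus for locus, supports in support_by_locus.items()
        if sum(s >= min_support for s in supports) > 4
    }
    # Divergent: at least min_divergence from the reference.
    divergent = {
        locus for locus, d in divergence_by_locus.items()
        if d >= min_divergence
    }
    return informative & divergent
```

With real data, `support_by_locus` would be parsed from the gene trees and `divergence_by_locus` from the distance output; both parsing steps are left out here because their file formats are not described in the text.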
Provider:
figshare
Created:
2025-10-29



