Resolving phylogenetic relationships of a recently and rapidly evolving lineage from western North America (Mentzelia section Bartonia, Loasaceae)
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.sf7m0cghb
Resource description:
The landscape of western North America has transformed dramatically since the Miocene, becoming increasingly heterogeneous and in turn promoting the evolution of many rapidly radiating angiosperm lineages. Phylogenetic relationships in these recently and rapidly radiating groups are difficult to resolve because there is little genetic variation among species and a high degree of noise from incomplete lineage sorting and hybridization. Mentzelia section Bartonia (51 species; Loasaceae) exemplifies this problem well. The clade has been investigated with Sanger sequencing, RADSeq, and genome skimming methods; however, most species relationships remain elusive due to low genetic variability. To better infer species relationships, we applied a hybrid enrichment approach with the Angiosperms353 probe set and implemented a novel bioinformatics workflow that aimed to maximize phylogenetic signal and minimize noise from low-quality sequences, paralogy, and incomplete lineage sorting. With our phylogenomic approach, we recovered greater resolution of species relationships than previous studies based on nrDNA loci. Although a few species relationships still lack strong support, our results indicate that our methods were effective for phylogenetic inference in this recently and rapidly evolving lineage from western North America. To better characterize major groups in the section, we propose the formal designation of three subsections: Decapetala, Multicaulis, and Multiflora.
Methods
1. Raw, paired-end reads were received from Illumina (raw sequences are available through NCBI)
2. Quality control of raw reads was performed with fastp (Chen et al. 2018)
3. Angiosperms353 loci (supercontigs [exons with flanking introns]) were assembled with HybPiper (Johnson et al. 2016)
4. Supercontigs were evaluated with HybPhaser (Nauheimer et al. 2021); the program removed low-quality samples and loci and generated consensus sequences with ambiguity codes
5. Sequences shorter than 25% of the mean recovered length were removed with filter_by_length.py (https://github.com/mossmatters/phyloscripts/tree/master/HybPiperUtils)
6. Loci were aligned with MAFFT (Katoh and Standley 2013)
7. Outliers were removed from the concatenated alignment with SpruceUp (Borowiec 2019)
8. Gene trees were inferred with IQ-TREE (Nguyen et al. 2015; Minh et al. 2022)
9. Long branches were removed with TreeShrink (Mai and Mirarab 2018)
10. Gene tree statistics for the 238-locus dataset were calculated with SortaDate (Smith et al. 2018)
11. Gene trees were then filtered based on the results of step 10. Only loci with above-average bipartition support were kept for the "108_locus" dataset
12. An additional dataset was made from a different subset of the 238-locus dataset by removing any locus with at least one paralogous sequence flagged by either HybPiper or HybPhaser; this generated the 75-locus dataset
13. Species trees were inferred for all three datasets with ASTRAL-III (Zhang et al. 2018)
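The three filtering rules in steps 5, 11, and 12 can be illustrated with a minimal Python sketch. This is not the authors' code and does not reproduce the logic of filter_by_length.py, SortaDate, HybPiper, or HybPhaser. All function names, input structures, and toy values below are hypothetical, and the sketch assumes that "mean recovered length" means the mean ungapped length of the sequences recovered for a single locus.

```python
from statistics import mean


def filter_by_length(seqs, min_fraction=0.25):
    """Step 5 (sketch): drop sequences shorter than min_fraction of the
    mean ungapped length at one locus. `seqs` maps sample name -> sequence.
    Assumption: the mean is taken over the sequences recovered for this locus.
    """
    if not seqs:
        return {}
    lengths = {name: len(seq.replace("-", "")) for name, seq in seqs.items()}
    cutoff = min_fraction * mean(lengths.values())
    return {name: seq for name, seq in seqs.items() if lengths[name] >= cutoff}


def above_average_loci(support):
    """Step 11 (sketch): keep loci whose bipartition support is strictly
    greater than the average across all loci. `support` maps locus -> value
    (hypothetical input, e.g. parsed from a per-gene statistics table).
    """
    avg = mean(support.values())
    return sorted(locus for locus, value in support.items() if value > avg)


def paralog_free_loci(all_loci, hybpiper_flagged, hybphaser_flagged):
    """Step 12 (sketch): remove any locus flagged for paralogy by either tool."""
    flagged = set(hybpiper_flagged) | set(hybphaser_flagged)
    return sorted(set(all_loci) - flagged)


if __name__ == "__main__":
    # Toy data only; locus and sample names are made up for illustration.
    locus_seqs = {"sampleA": "ACGT" * 25, "sampleB": "ACGT" * 20, "sampleC": "ACGTACGTAC"}
    print(sorted(filter_by_length(locus_seqs)))
    print(above_average_loci({"locus1": 0.9, "locus2": 0.4, "locus3": 0.2}))
    print(paralog_free_loci(["locus1", "locus2", "locus3"], ["locus2"], ["locus3"]))
```

Treating the paralogy filter as a set union mirrors the wording "flagged by either HybPiper or HybPhaser": a single flag from one tool is enough to exclude the locus.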
Created:
2025-02-05



