
Integrating deep learning derived morphological traits and molecular data for total-evidence phylogenetics: lessons from digitized collections

Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.9cnp5hqqq
Description:
Deep learning has previously been shown to generate, automatically, morphological traits that carry a phylogenetic signal. In this paper, we combine molecular data with deep-learning-derived morphological traits from images of pinned insects to generate total-evidence phylogenies, and we identify the challenges involved. Deep-learning-derived morphological traits, while informative, underperform molecular analyses when used in isolation. However, they can improve molecular results in total-evidence settings. We use a dataset of rove beetle images to compare the effect of different dataset splits and deep metric loss functions on morphological and total-evidence results. We find a slight preference for the cladistic dataset split and the contrastive loss function. Additionally, we vary the number of genes used in inference and find that the gene combinations that give the best results on their own differ from those that give the best results in total-evidence analysis. Despite the promise of integrating deep learning techniques with molecular data, challenges remain regarding the strength of the phylogenetic signal and the resource demands of data acquisition. We suggest that future work focus on improved trait extraction and on the development of disentangled networks to better interpret the derived traits, thus expanding the applicability of these methods in phylogenetic studies.

Methods
Images of specimens associated with this dataset can be found in the Rove-Tree-11 dataset (https://doi.org/10.17894/ucph.39619bba-4569-4415-9f25-d6a0ff64f0e3). Molecular data were gathered from GenBank and aligned using MAFFT 7. Original GenBank accession numbers are provided. Alignments were concatenated with FASconCAT-G. The partition scheme and substitution models were selected using PartitionFinder 2.1.1. Trees were obtained via Maximum Parsimony using TNT. Example TNT scripts are provided.
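
The concatenation step named in the Methods can be illustrated with a small sketch. This is not FASconCAT-G and not the authors' script; it only shows the general idea of joining per-gene alignments into one supermatrix, padding taxa that lack a gene with missing-data characters and recording partition boundaries. The function name `concatenate_alignments`, the gene labels, and the taxon names are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Tuple

# One alignment maps taxon name -> aligned sequence (all of equal length).
Alignment = Dict[str, str]


def concatenate_alignments(
    genes: Dict[str, Alignment], missing: str = "?"
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into a supermatrix (illustrative sketch).

    Returns the concatenated sequences per taxon and a list of
    (gene, start, end) partitions using 1-based inclusive coordinates.
    """
    # Every taxon seen in any gene gets a row in the supermatrix.
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1

    for gene, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa without this gene are padded with the missing-data symbol.
            rows[taxon].append(aln.get(taxon, missing * length))
        partitions.append((gene, start, start + length - 1))
        start += length

    return {taxon: "".join(parts) for taxon, parts in rows.items()}, partitions


# Hypothetical toy input: two genes, three taxa, incomplete sampling.
toy_genes = {
    "gene_1": {"taxon_a": "ACGT", "taxon_b": "AC-T"},
    "gene_2": {"taxon_a": "GG", "taxon_c": "GA"},
}
matrix, parts = concatenate_alignments(toy_genes)
```

The recorded partition boundaries are the kind of information a partitioning tool needs as input, which is why the sketch returns them alongside the matrix.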
Created:
2024-12-18