BIMAGES: Bivalve images for morphological analysis and genetic estimation study
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.k6djh9wd0
Description:
Reconstructing the tree of life and understanding the relationships of taxa are core questions in evolutionary and systematic biology. The main advances in this field in the last decades were derived from molecular phylogenetics; however, for most species, molecular data are not available. Here, we explore the applicability of two deep learning methods, supervised classification approaches and unsupervised similarity learning, to infer organism relationships from specimen images. As a basis, we assembled an image dataset covering 4144 bivalve species belonging to 74 families across all orders and subclasses of the extant Bivalvia, with molecular phylogenetic data available for all families and a complete taxonomic hierarchy for all species. The suitability of this dataset for deep learning experiments was evidenced by an ablation study resulting in almost 80% accuracy for identifications at the species level. Three sets of experiments were performed using our dataset. First, we included taxonomic hierarchy and genetic distances in a supervised learning approach to obtain predictions on several taxonomic levels simultaneously. Here, we encouraged the model to treat features shared between closely related taxa as more critical for their classification than features shared with distantly related taxa, imprinting phylogenetic and taxonomic affinities into the architecture and training procedure. Second, we used transfer learning and similarity learning approaches for zero-shot experiments to identify the higher-level taxonomic affinities of test species that the models had not been trained on. The models assigned the unknown species to their respective genera with accuracies of approximately 48% and 67%. Lastly, we used unsupervised similarity learning to infer the relatedness of the images without prior knowledge of their taxonomic or phylogenetic affinities.
The results indicated a reasonable similarity between visual appearance and genetic relationships at the higher taxonomic levels. The correlation was 0.6 for the most species-rich subclass, the Imparidentia, and ranged from 0.5 to 0.7 for the orders with the most images. Overall, the correlation between visual similarity and genetic distances at the family level was 0.78. However, fine-grained reconstructions based on the observed correlation, such as sister-taxa relationships, require further work. Overall, our results broaden the applicability of automated taxon identification systems and provide a new avenue for estimating phylogenetic relationships from specimen images.
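To make the reported correlations more concrete, the sketch below shows one common way to compare a visual distance matrix with a genetic distance matrix: flatten the upper triangle of each and compute a correlation coefficient. The description does not say which coefficient or procedure the authors used, so the choice of Pearson correlation, the function names, and the toy 4x4 matrices are all assumptions made for illustration.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def matrix_correlation(visual, genetic):
    """Correlate pairwise distances; both matrices must share the taxon order."""
    return pearson(upper_triangle(visual), upper_triangle(genetic))


# Invented toy distances between four hypothetical families (not real data).
visual = [
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
genetic = [
    [0.0, 0.10, 0.50, 0.60],
    [0.10, 0.0, 0.55, 0.65],
    [0.50, 0.55, 0.0, 0.20],
    [0.60, 0.65, 0.20, 0.0],
]

r = matrix_correlation(visual, genetic)
print(f"correlation between visual and genetic distances: {r:.2f}")
```

Only the upper triangle is used because distance matrices are symmetric with a zero diagonal, so including every cell would count each pair twice and add uninformative zeros.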
Methods
The image dataset was obtained from three main sources: data aggregation platforms such as GBIF and iDigBio, natural history museums, and websites of shell dealers and private enthusiasts (see Appendix in manuscript).
To maximize the images' information density and reduce noise and potential bias caused by objects other than bivalves, all images were subjected to an automated image segmentation process that decomposed them into individual items. Only images showing the inner or outer lateral side of the shells were kept. When necessary, images were rotated in steps of 90° to bring them as close as possible to the correct scientific position, with the hinge line up.
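The rotation step can be illustrated with a minimal sketch that turns a row-major pixel grid by multiples of 90°. How the required number of quarter turns was decided for each image is not described here, so `steps` is simply taken as an input; the function names are hypothetical and a plain nested list stands in for an image.

```python
def rotate_90_clockwise(pixels):
    """Rotate a row-major 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def rotate_by_steps(pixels, steps):
    """Apply `steps` clockwise quarter turns; negative values turn counter-clockwise."""
    result = [list(row) for row in pixels]
    for _ in range(steps % 4):
        result = rotate_90_clockwise(result)
    return result


# A 2x3 toy "image": after one quarter turn it becomes 3x2.
image = [
    [1, 2, 3],
    [4, 5, 6],
]
rotated = rotate_by_steps(image, 1)
```

Restricting rotation to quarter turns avoids any resampling of pixels, which is presumably why the images could only be aligned "as close as possible" rather than exactly.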
One of the authors (SK) evaluated the identification of each image based on his taxonomic expertise and removed all images he considered incorrectly identified. To update the taxonomic assignment of each species and to re-assign synonymized names, all names were checked against the World Register of Marine Species (WoRMS), and each image was labeled according to the currently accepted name and taxonomic hierarchy (species, genus, and family) indicated by WoRMS. Images of species not found in WoRMS were removed.
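The relabeling logic described above can be sketched as follows. This is not the authors' pipeline and it does not call WoRMS: two small in-memory dictionaries stand in for the register, and every taxon name, file name, and function name in the example is a made-up placeholder.

```python
from typing import NamedTuple, Optional


class Taxon(NamedTuple):
    species: str
    genus: str
    family: str


# Hypothetical stand-ins for a WoRMS lookup (placeholder names, not real taxa).
ACCEPTED = {
    "Exampla prima": Taxon("Exampla prima", "Exampla", "Examplidae"),
}
SYNONYMS = {
    "Oldname prima": "Exampla prima",  # synonymized name -> accepted name
}


def resolve(name: str) -> Optional[Taxon]:
    """Map a name to its accepted taxon, or None if it is not in the register."""
    accepted_name = SYNONYMS.get(name, name)
    return ACCEPTED.get(accepted_name)


def relabel(images):
    """Relabel (filename, name) pairs; drop images whose name cannot be resolved."""
    labeled = []
    for filename, name in images:
        taxon = resolve(name)
        if taxon is not None:
            labeled.append((filename, taxon))
    return labeled


images = [
    ("img_001.jpg", "Exampla prima"),   # already an accepted name
    ("img_002.jpg", "Oldname prima"),   # synonym, re-assigned
    ("img_003.jpg", "Unknownus nomen"), # not in the register, removed
]
labeled = relabel(images)
```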
The assignment to the taxonomic level of the order follows WoRMS, and if no order was available, the superfamily indicated in WoRMS was used instead. Our assignment to a subclass does not follow WoRMS; instead, we applied the more traditional classification into Protobranchia, Pteriomorphia, Palaeoheterodonta, Archiheterodonta, Anomalodesmata, and Imparidentia, used in many recent phylogenetic publications on the Bivalvia.
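The fallback rule for the order level, and the separate subclass assignment, might look like the following sketch. The group names and the lookup table are hypothetical placeholders rather than the authors' actual mapping; only the six subclass names come from the text above.

```python
from typing import Optional

SUBCLASSES = {
    "Protobranchia", "Pteriomorphia", "Palaeoheterodonta",
    "Archiheterodonta", "Anomalodesmata", "Imparidentia",
}

# Hypothetical hand-curated table: order-level group -> subclass.
SUBCLASS_BY_GROUP = {
    "Exampleida": "Imparidentia",
    "Examploidea": "Anomalodesmata",
}


def order_level_label(order: Optional[str], superfamily: Optional[str]) -> Optional[str]:
    """Use the order when one is given; otherwise fall back to the superfamily."""
    return order if order else superfamily


def subclass_label(group: Optional[str]) -> Optional[str]:
    """Look up the subclass in the hand-curated table; None if the group is unknown."""
    return SUBCLASS_BY_GROUP.get(group)


# A record without an order falls back to its superfamily.
group = order_level_label(None, "Examploidea")
subclass = subclass_label(group)
```

Keeping the subclass in a separate table reflects that it is assigned independently of the register's own hierarchy.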
Created:
2024-07-25



