
Data fusion for integrative species identification using deep learning

Source: DataONE. Updated 2025-11-12; indexed 2025-11-22.
Download link:
https://search.dataone.org/view/sha256:d65f92e99018b97c2d28027abb84f15ac946363035b788d13cdaf3314ef463e0
Resource description:

DNA analyses have revolutionized species identification and taxonomic work. Yet, persistent challenges arise from little differentiation among species and considerable variation within species, particularly among closely related groups. While images are commonly used as an alternative modality for automated identification tasks, their usability is limited by the same concerns. An integrative strategy, fusing molecular and image data through machine learning, holds significant promise for fine-grained species identification. However, a systematic overview and rigorous statistical testing concerning molecular and image preprocessing and fusion techniques, including practical advice for biologists, are missing so far. We introduce a machine learning scheme that integrates both molecular and morphological data for species identification. Initially, we systematically assess and compare three different DNA arrangements and two encoding methods. Later, artificial neural networks are used to ext...

# Data from: Data fusion for integrative species identification using deep learning

[https://doi.org/10.5061/dryad.4qrfj6qjk](https://doi.org/10.5061/dryad.4qrfj6qjk)

## Description of the data and file structure

### Data

The data folder contains the records and alignment files for each of the four datasets used in this study (i.e., Asteraceae, Poaceae, Coccinellidae, Lycaenidae). The text file contains the following information about the records:

- 'record_id': a uniquely assigned custom ID for the record
- 'species_name': the name of the species
- 'taxonomy': the taxonomic information linked to the species, provided by NCBI
- 'genbank_accession': the GenBank accession provided by NCBI, included for completeness
- 'image_url': the original URL of the record's image
- 'image_rights_holder': the rights holder of the image, if provided by GBIF

Sequences in the respective fasta files can be linked to their records via their unique record ID (e.g., 'BOLD642' in column 'record_...
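The linking step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the tab delimiter, the convention that the record ID is the first token of each FASTA header, the function names, and the sample sequences and species names are all hypothetical. Only the column names 'record_id' and 'species_name' and the example ID 'BOLD642' come from the description.

```python
import csv
import io


def parse_fasta(text):
    """Return a dict mapping each FASTA header ID to its full sequence."""
    sequences = {}
    current_id = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current_id is not None:
                sequences[current_id] = "".join(chunks)
            # Assumption: the record ID is the first whitespace-separated
            # token of the header line.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            chunks = []
        else:
            chunks.append(line)
    if current_id is not None:
        sequences[current_id] = "".join(chunks)
    return sequences


def link_records(records_text, fasta_text, delimiter="\t"):
    """Attach a 'sequence' field to every record, matched on 'record_id'.

    Records without a matching FASTA entry get sequence=None.
    Assumption: the records file is delimited text with a header row.
    """
    sequences = parse_fasta(fasta_text)
    reader = csv.DictReader(io.StringIO(records_text), delimiter=delimiter)
    linked = []
    for row in reader:
        row = dict(row)
        row["sequence"] = sequences.get(row["record_id"])
        linked.append(row)
    return linked


# Made-up sample data for illustration only.
RECORDS = (
    "record_id\tspecies_name\n"
    "BOLD642\tExample species A\n"
    "BOLD643\tExample species B\n"
)
FASTA = ">BOLD642\nACGT\nTTGA\n>BOLD999\nGGCC\n"

linked = link_records(RECORDS, FASTA)
for row in linked:
    print(row["record_id"], row["species_name"], row["sequence"])
```

With real files, the two text arguments would be read from the records text file and the matching fasta file of one dataset; the delimiter should be checked against the actual file before use.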
Created:
2025-11-13