BIMAGES: Bivalve images for morphological analysis and genetic estimation study
Source: DataONE. Updated 2024-07-25; indexed 2025-04-26.
Download link:
https://search.dataone.org/view/sha256:ca2478db2592f456788e56563d0cefa5eec650303c18258949dbb80b8e1b83ee
Description:
Reconstructing the tree of life and understanding the relationships of taxa are core questions in evolutionary and systematic biology. The main advances in this field in the last decades were derived from molecular phylogenetics; however, for most species, molecular data are not available. Here, we explore the applicability of two deep learning methods (supervised classification approaches and unsupervised similarity learning) to infer organism relationships from specimen images. As a basis, we assembled an image dataset covering 4144 bivalve species belonging to 74 families across all orders and subclasses of the extant Bivalvia, with molecular phylogenetic data being available for all families and a complete taxonomic hierarchy for all species. The suitability of this dataset for deep learning experiments was evidenced by an ablation study resulting in almost 80% accuracy for identifications on the species level. Three sets of experiments were performed using our dataset. First, we ...

The image dataset was obtained from three main sources: data aggregation platforms such as GBIF and iDigBio, natural history museums, and websites of shell dealers and private enthusiasts (see Appendix in manuscript).
To maximize the images' information density and reduce noise and potential bias caused by objects other than bivalves, all images were subject to an automated image segmentation process to decompose them into individual items. Only images showing the inner or outer lateral side of the shells were kept. When necessary, images were rotated into the correct scientific position with the hinge line up to the best possible extent by steps of 90°.
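
The rotation "by steps of 90°" mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `rotate_quarter_turns` is hypothetical, the image is assumed to be a plain 2-D grid of pixel values, and deciding how many turns bring the hinge line up is left out entirely.

```python
def rotate_quarter_turns(pixels, turns):
    """Rotate a 2-D grid of pixels clockwise by `turns` steps of 90 degrees.

    Negative values rotate counterclockwise; the input grid is not modified.
    """
    rotated = [list(row) for row in pixels]
    for _ in range(turns % 4):
        # Reversing the row order and transposing gives one clockwise quarter turn.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated
```

Because only quarter turns are applied, no pixel is interpolated, so the image content stays unchanged apart from its orientation.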
One of the authors (SK) evaluated the identification of each image based on his taxonomic expertise and removed all images considered as incorrectly identified. To update the taxonomic assignment of each species and to re-assign synonymized names, all names were checked against the World Register of Marine Species (WoRMS), and each imag...

# BIMAGES: Bivalve images for morphological analysis and genetic estimation study
## Inferring Taxonomic Affinities and Genetic Distances Using Morphological Features Extracted from Specimen Images: a Case Study with a Bivalve dataset
This preprocessed fine-grained labeled dataset contains 71,888 images of 4,144 species in 884 genera, 74 families, 26 orders, and six subclasses; the phylogenetic study by Bieler et al. (2014) covers all 74 families.
## Description of the data and file structure
Metadata and labels are located inside the meta.tsv file. The code is located in the respective code folder.
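
As a starting point for working with the labels, the sketch below reads a tab-separated file with Python's standard `csv` module and counts rows per label. The column layout of `meta.tsv` is not documented here, so any column name passed to `images_per_label` (for example `family`) is an assumption, as is the idea that each row describes one image. Both helper names are hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def load_metadata(path):
    """Read a tab-separated metadata file into a list of row dictionaries."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def images_per_label(rows, column):
    """Count rows for each distinct value of `column`.

    Assumes one row per image; the column name must match the real header.
    """
    if rows and column not in rows[0]:
        raise KeyError(f"column {column!r} not found; available: {sorted(rows[0])}")
    return Counter(row[column] for row in rows)
```

In practice one would first call `load_metadata("meta.tsv")` and inspect the keys of the first row to learn the real column names before counting anything.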
## Sharing/Access information
All images can be accessed through the h5 database files.
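
The internal layout of the h5 files is not described here, so a cautious first step is to list what they contain. The sketch below uses the third-party `h5py` package to walk a file and report every dataset; the helper names `list_datasets` and `read_item`, and the assumption that images are stacked along the first axis of a dataset, are illustrative only.

```python
import h5py


def list_datasets(path):
    """Return (name, shape, dtype) for every dataset stored in an HDF5 file."""
    found = []

    def visit(name, obj):
        # Groups are skipped; only datasets carry array data.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as handle:
        handle.visititems(visit)
    return found


def read_item(path, dataset_name, index):
    """Load one entry along the first axis of a dataset as a NumPy array."""
    with h5py.File(path, "r") as handle:
        return handle[dataset_name][index]
```

Reading a single index at a time avoids loading a whole dataset into memory, which matters for a collection of this size.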
Data was derived from the following sources:
| Source | Website | N of images | | |
| -------------------------...
Created: 2024-07-26



