Data from: BioEncoder: a metric learning toolkit for comparative organismal biology
收藏NIAID Data Ecosystem2026-05-02 收录
下载链接:
https://zenodo.org/record/10909613
下载链接
链接失效反馈官方服务:
资源简介:
BioEncoder: a metric learning toolkit for comparative organismal biology
Abstract - In the realm of biological image analysis, deep learning (DL) has become a core toolkit, e.g., for segmentation and classification. However, conventional DL methods are challenged by large biodiversity datasets characterized by unbalanced classes and hard-to-distinguish phenotypic differences between them. Here we present BioEncoder, a user-friendly toolkit for metric learning, which overcomes these challenges by focussing on learning relationships between individual data points rather than on the separability of classes. BioEncoder is released as a Python package, created for ease of use and flexibility across diverse datasets. It features taxon-agnostic data loaders, custom augmentation options, and simple hyperparameter adjustments through text-based configuration files. The toolkit's significance lies in its potential to unlock new research avenues in biological image analysis while democratizing access to advanced deep metric learning techniques. BioEncoder focuses on the urgent need for toolkits bridging the gap between complex DL pipelines and practical applications in biological research.
Dataset - This data repository includes two things: a snapshot of the BioEncoder package (BioEncoder-main.zip, version 1.0.0, downloaded from https://github.com/agporto/BioEncoder on 2024-07-19 at 17:20), and the damselfly dataset used for the case study presented in the paper (bioencoder_data.zip). The dataset archive also encompasses the configuration files and the final model checkpoints from the case study, as well as a script to reproduce the results and figures presented in the paper.
How to use - Get started by consulting the GithHub repository for information on how to install BioEncoder, then download the data archive and run the script. Some parts of the script can be executed using the model checkpoints, for orther parts the training rountine needs to be run.
创建时间:
2024-07-26



