imageomics/VLM4Bio
Source: Hugging Face | Updated: 2026-01-09 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/imageomics/VLM4Bio
Description:
VLM4Bio is a benchmark dataset of scientific question-answer pairs for evaluating pretrained vision-language models (VLMs) on trait discovery from biological images. It contains images from three taxonomic groups of organisms (fish, birds, and butterflies), with around 10K images per group. The dataset supports five biologically relevant tasks: species classification, trait identification, trait referring, trait grounding, and trait counting. In total it comprises 469K question-answer pairs over around 30K images. The tasks are designed to test different facets of VLM performance in organismal biology, from predictive accuracy to the ability to reason about predictions using visual cues of known biological traits. Questions come in both open-ended and multiple-choice (MC) formats.
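Because the benchmark mixes open-ended and multiple-choice questions, a scorer has to treat the two formats differently. The following is a minimal sketch of one way to do that, not the benchmark's official evaluation code. The `QAItem` record layout, the `normalize` helper, and the matching rules (exact match for MC, substring match for open-ended) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QAItem:
    """Hypothetical record layout; the real dataset's fields may differ."""
    question: str
    answer: str
    options: Optional[List[str]] = None  # None marks an open-ended question


def normalize(text: str) -> str:
    """Lowercase, drop a trailing period, and collapse whitespace."""
    return " ".join(text.lower().strip().rstrip(".").split())


def is_correct(item: QAItem, prediction: str) -> bool:
    pred, gold = normalize(prediction), normalize(item.answer)
    if item.options is not None:
        # Multiple choice: require an exact match with the gold option.
        return pred == gold
    # Open-ended: crude assumption that the gold string appears in the reply.
    # Substring matching can over-credit short answers (e.g. "3" inside "13").
    return gold in pred


def accuracy(items: List[QAItem], predictions: List[str]) -> float:
    """Fraction of items answered correctly; 0.0 for an empty list."""
    if not items:
        return 0.0
    hits = sum(is_correct(i, p) for i, p in zip(items, predictions))
    return hits / len(items)
```

Under these assumptions, calling `accuracy(items, predictions)` on parallel lists of items and model replies gives a single score per task. A stricter open-ended matcher (token-level or numeric comparison) would be a reasonable substitute for the substring rule.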
Provider:
imageomics



