futurehouse/lab-bench
Source: Hugging Face. Updated: 2025-09-27. Indexed: 2024-07-22.
Download link:
https://hf-mirror.com/datasets/futurehouse/lab-bench
Overview:
The Language Agent Biology Benchmark, or LAB-Bench, is an evaluation dataset for AI systems intended to benchmark capabilities foundational to scientific research in biology. The dataset currently consists of 8 broad categories comprising 30 narrower subtasks: extracting information from the scientific literature (LitQA2), retrieving information from databases (DbQA) and supplementary information (SuppQA), reasoning about scientific figures (FigQA) and tables (TableQA), troubleshooting biological protocols (ProtocolQA), manipulating biological sequences (SeqQA), and a set of particularly difficult Cloning Scenarios involving capabilities common to molecular cloning workflows. This public repository contains about 80% of the full dataset; the remaining 20% is held out as a private test subset to monitor for training contamination. The dataset also includes a canary string to help model builders filter it out of future training data.
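The canary string mentioned in the overview lends itself to a simple substring filter over a training corpus. The following is a minimal sketch of that idea, not a procedure taken from the dataset card: `CANARY` is a hypothetical placeholder (substitute the real value from the dataset's `canary` field), and `drop_contaminated` is an illustrative helper name.

```python
from typing import Iterable, Iterator

# Hypothetical placeholder: replace with the actual value of the `canary` field.
CANARY = "EXAMPLE-CANARY-STRING-0000"


def drop_contaminated(docs: Iterable[str], canary: str = CANARY) -> Iterator[str]:
    """Yield only the documents that do not contain the canary string."""
    for doc in docs:
        if canary not in doc:
            yield doc


# Toy corpus: the second document embeds the canary and should be dropped.
corpus = [
    "An unrelated web page about PCR primers.",
    f"A scraped benchmark question. {CANARY}",
]
clean = list(drop_contaminated(corpus))
print(len(clean))
```

A plain substring test is the simplest option; a real pipeline might also normalize whitespace or casing before matching, which this sketch does not attempt.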
Provided by:
futurehouse
Original information summary
LAB-Bench dataset overview
Basic dataset information
- Name: LAB-Bench
- License: CC BY-SA 4.0
- Size category: 1K < n < 10K
- Task category: Question Answering
- Tag: Biology
Dataset configurations
CloningScenarios
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - subtask: string
- Splits:
  - train:
    - Bytes: 394060
    - Examples: 33
- Download size (bytes): 104821
- Dataset size (bytes): 394060
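Every configuration shares the `question`, `ideal`, and `distractors` fields, which suggests a multiple-choice presentation. The sketch below shows one way to assemble such a prompt from a record; it assumes `ideal` is the correct answer and `distractors` are the wrong options. The function name `build_multiple_choice` and the sample record are hypothetical and are not taken from the dataset.

```python
import random
from typing import Dict, List, Tuple

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def build_multiple_choice(record: Dict, seed: int = 0) -> Tuple[str, str]:
    """Return (prompt, answer_letter) for one record.

    Options are shuffled with a seeded RNG so the correct answer does not
    always appear in the same position, while runs stay reproducible.
    """
    options: List[str] = [record["ideal"], *record["distractors"]]
    random.Random(seed).shuffle(options)
    lines = [record["question"]]
    for letter, option in zip(LETTERS, options):
        lines.append(f"({letter}) {option}")
    answer = LETTERS[options.index(record["ideal"])]
    return "\n".join(lines), answer


# Hypothetical record that mirrors the schema; not a real LAB-Bench item.
record = {
    "id": "example-0",
    "question": "Which enzyme joins DNA fragments during cloning?",
    "ideal": "DNA ligase",
    "distractors": ["DNA helicase", "RNA polymerase", "Reverse transcriptase"],
    "canary": "",
    "subtask": "example",
}
prompt, answer = build_multiple_choice(record)
print(prompt)
print("Correct option:", answer)
```

Seeding the shuffle per record keeps evaluation runs comparable; whether the official evaluation harness shuffles options this way is not stated in the source.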
DbQA
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - subtask: string
- Splits:
  - train:
    - Bytes: 621148
    - Examples: 520
- Download size (bytes): 171318
- Dataset size (bytes): 621148
FigQA
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - subtask: string
  - figure: image
  - figure-path: string
- Splits:
  - train:
    - Bytes: 243738309
    - Examples: 181
- Download size (bytes): 165755937
- Dataset size (bytes): 243738309
LitQA2
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - tag: string
  - version: string
  - sources: sequence of string
  - subtask: string
  - key-passage: string
- Splits:
  - train:
    - Bytes: 180544
    - Examples: 199
- Download size (bytes): 100047
- Dataset size (bytes): 180544
ProtocolQA
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - protocol: string
  - subtask: string
- Splits:
  - train:
    - Bytes: 687586
    - Examples: 108
- Download size (bytes): 265657
- Dataset size (bytes): 687586
SeqQA
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - subtask: string
- Splits:
  - train:
    - Bytes: 967022
    - Examples: 600
- Download size (bytes): 366537
- Dataset size (bytes): 967022
SuppQA
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - source: string
  - subtask: string
  - paper-title: string
- Splits:
  - train:
    - Bytes: 39449
    - Examples: 82
- Download size (bytes): 26264
- Dataset size (bytes): 39449
TableQA
- Features:
  - id: string
  - question: string
  - ideal: string
  - distractors: sequence of string
  - canary: string
  - subtask: string
  - tables: sequence of image
  - table-path: sequence of string
- Splits:
  - train:
    - Bytes: 108530097
    - Examples: 244
- Download size (bytes): 106195079
- Dataset size (bytes): 108530097
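As a quick consistency check, the per-configuration example counts listed above can be tallied and compared against the stated size category (1K < n < 10K). The sketch below does that, and also shows how a single configuration might be loaded. The loader assumes the Hugging Face `datasets` library is installed and that network access is available, so it is defined but not called here; `load_config` is an illustrative helper name.

```python
# Example counts copied from the configuration listings above.
EXAMPLES_PER_CONFIG = {
    "CloningScenarios": 33,
    "DbQA": 520,
    "FigQA": 181,
    "LitQA2": 199,
    "ProtocolQA": 108,
    "SeqQA": 600,
    "SuppQA": 82,
    "TableQA": 244,
}

total = sum(EXAMPLES_PER_CONFIG.values())
print(total)  # 1967

# The total should fall inside the declared size category.
in_size_category = 1_000 < total < 10_000
print(in_size_category)  # True


def load_config(name: str):
    """Load the train split of one configuration (needs `datasets` and network)."""
    from datasets import load_dataset  # imported lazily so the tally runs without it

    return load_dataset("futurehouse/lab-bench", name, split="train")
```

For instance, `load_config("LitQA2")` would be expected to return the 199 public LitQA2 examples, each with the fields listed for that configuration.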



