Genentech/compbiobench-data-v1
Source: Hugging Face. Updated 2026-04-28; indexed 2026-05-10.
Download link:
https://hf-mirror.com/datasets/Genentech/compbiobench-data-v1
Description:
CompBioBench v1 is a benchmark of 100 diverse tasks for evaluating agentic systems in computational biology. Unlike mathematics and programming, which more readily admit systematic verification, biological data are inherently noisy and open to interpretation. To enable objective evaluation without reducing tasks to prescriptive checklists, we propose a new benchmark-construction strategy based on synthetic/augmented data and metadata scrambling/scrubbing of real datasets to create challenging problems with a single ground-truth answer that require multi-step reasoning, tool use, bespoke code, and interaction with real-world external resources. The benchmark spans genomics, transcriptomics, epigenomics, single-cell analysis, human genetics, and machine learning workflows. Questions are curated by domain experts to cover a broad range of skills with varying difficulty.
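To make the idea of metadata scrambling/scrubbing concrete, here is a minimal Python sketch. It is not the benchmark authors' construction code: the function name `scramble_metadata`, the record fields (`id`, `tissue`, `expr`), and the toy values are all hypothetical, chosen only for illustration. The sketch hides a label field and replaces informative sample IDs with anonymous ones, so that the held-out mapping serves as a single ground-truth answer.

```python
import random


def scramble_metadata(samples, label_key, seed=0):
    """Hide a label field and anonymize sample IDs (illustrative only).

    Returns (scrubbed_samples, answer_key), where answer_key maps each
    anonymous ID to the hidden label. All field names are hypothetical.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)  # break any ordering that could leak the label

    scrubbed, answer_key = [], {}
    for i, sample in enumerate(shuffled):
        anon_id = f"sample_{i:03d}"
        # Drop the original ID and the label; keep the measurements.
        record = {k: v for k, v in sample.items() if k not in ("id", label_key)}
        record["id"] = anon_id
        scrubbed.append(record)
        answer_key[anon_id] = sample[label_key]
    return scrubbed, answer_key


# Toy input: the original IDs reveal the tissue, so they must be replaced.
samples = [
    {"id": "liver_rep1", "tissue": "liver", "expr": [5.1, 0.2]},
    {"id": "liver_rep2", "tissue": "liver", "expr": [4.9, 0.3]},
    {"id": "brain_rep1", "tissue": "brain", "expr": [0.4, 6.0]},
    {"id": "brain_rep2", "tissue": "brain", "expr": [0.5, 5.8]},
]
scrubbed, answer_key = scramble_metadata(samples, label_key="tissue", seed=42)
```

Under these assumptions, an agent would receive only `scrubbed` and be asked to recover the tissue of each anonymous sample from the measurements, while `answer_key` stays hidden and is used for scoring.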
Provider:
Genentech



