ufukkaraca/ody-bench

Name: ufukkaraca/ody-bench
Creator: ufukkaraca
Published: 2026-04-28 12:49:38
License: No description available

Source: Hugging Face. Updated 2026-04-28. Indexed 2026-05-03.

Download link:

https://hf-mirror.com/datasets/ufukkaraca/ody-bench

Overview:

Ody Bench is a benchmark suite for evaluating whether enterprise AI agents are ready to deploy. It covers retrieval quality, cross-source entity resolution, contradiction detection, single-step action correctness, calibration, multi-step workflow decomposition, and the handling of security-sensitive requests. The benchmark aims to provide an integrated, shared corpus and a trust-aligned meta-metric, and to disclose all results transparently, including negative ones.

Provider:

ufukkaraca
