
ManagerBench

ModelScope Community | Updated 2026-01-02 | Indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/AI-ModelScope/ManagerBench
Resource overview:
# ManagerBench

ManagerBench is a benchmark designed to evaluate the decision-making capabilities of large language models (LLMs) as they evolve from conversational assistants into autonomous agents. ManagerBench addresses a critical gap: assessing how models navigate real-world scenarios where the most effective path to achieving operational goals may conflict with human safety.

## Dataset Details

The benchmark evaluates models through realistic, human-validated managerial scenarios where agents must choose between:

1. Pragmatic but potentially harmful actions that successfully achieve operational goals.
2. Safe actions that protect human welfare but may lead to suboptimal operational outcomes.

A parallel control set uses scenarios where potential harm targets only inanimate objects. This makes it possible to measure a model's pragmatism and to identify when its safety concerns are overly conservative.

### Dataset Structure

Each example in the dataset is structured as a list of three items:

1. **System Prompt**: Contains the scenario description and the operational goal.
2. **User Prompt**: Contains the available options for the agent to choose from.
3. **Metadata**: A string separated by "_" containing:
   - `domain` - The domain of the example.
   - `human_harm` - The type of human harm involved.
   - `LLM_incentives` - The type of incentive given to the AI to simulate realistic motivations.
   - `model` - The name of the model that generated the example.

### Parameterization

The examples contain placeholders for **Harm Intensity** and **Benefit Intensity** that should be replaced with specific percentage values:

- **Harm percentage**: Set at either 5% or 15%.
- **Benefit percentage**: Set at either 10% or 50%.

These parameters allow for systematic evaluation across different risk-reward trade-offs.
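
The metadata layout and the parameter values described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code. It assumes that the first three metadata fields contain no underscores (any extra underscores are kept in the model name), and the placeholder tokens `{harm_percentage}` and `{benefit_percentage}` are hypothetical stand-ins, since the card does not state the literal placeholder text used in the examples.

```python
from itertools import product
from typing import NamedTuple

# Values stated in the Parameterization section.
HARM_LEVELS = (5, 15)
BENEFIT_LEVELS = (10, 50)


class ExampleMetadata(NamedTuple):
    domain: str
    human_harm: str
    llm_incentives: str
    model: str


def parse_metadata(raw: str) -> ExampleMetadata:
    """Split the "_"-separated metadata string into its four documented fields.

    Assumption: only the trailing model name may itself contain "_",
    so at most three splits are made from the left.
    """
    parts = raw.split("_", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 metadata fields, got {len(parts)}: {raw!r}")
    return ExampleMetadata(*parts)


def parameter_grid() -> list[tuple[int, int]]:
    """Return every (harm, benefit) percentage combination."""
    return list(product(HARM_LEVELS, BENEFIT_LEVELS))


def fill_placeholders(
    text: str,
    harm: int,
    benefit: int,
    harm_token: str = "{harm_percentage}",        # hypothetical token name
    benefit_token: str = "{benefit_percentage}",  # hypothetical token name
) -> str:
    """Substitute concrete percentages for the two placeholder tokens."""
    return text.replace(harm_token, f"{harm}%").replace(benefit_token, f"{benefit}%")
```

With two harm levels and two benefit levels, `parameter_grid()` yields four combinations per example. Pass the actual placeholder strings found in the dataset as `harm_token` and `benefit_token` once they have been confirmed by inspecting a few examples.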
## Usage

```python
from datasets import load_dataset

dataset = load_dataset("AdiSimhi/managerbench")
```

### Dataset Sources

- **Code:** https://github.com/technion-cs-nlp/ManagerBench
- **Paper:** https://arxiv.org/pdf/2510.00857
- **Website:** https://technion-cs-nlp.github.io/ManagerBench-website/

**BibTeX:**

    @article{simhi2025managerbench,
      title={ManagerBench: Evaluating the Safety-Pragmatism Trade-off in Autonomous LLMs},
      author={Adi Simhi and Jonathan Herzig and Martin Tutek and Itay Itzhak and Idan Szpektor and Yonatan Belinkov},
      eprint={2510.00857},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2510.00857},
      year={2025}
    }

Provider:
maas
Created:
2025-10-09