MACHIAVELLI
Source: arXiv | Updated: 2023-06-13 | Indexed: 2024-06-21
Download link:
https://aypan17.github.io/machiavelli
Description:
MACHIAVELLI is a benchmark of 134 choose-your-own-adventure games, created by Alexander Pan et al. at the University of California, Berkeley. It contains more than 570,000 rich and diverse social decision-making scenarios and is designed to evaluate how general-purpose models such as GPT-4 behave in interactive environments. Scenarios are annotated automatically by language models, which is more efficient than manual annotation. MACHIAVELLI assesses not only a model's capabilities but also its unethical tendencies, including power-seeking, reducing utility, and ethical violations. Because the environments are densely annotated, the benchmark can be used to build behavioral reports for agents and to measure the trade-off between reward and ethical behavior.
Provided by:
University of California, Berkeley
Created:
2023-04-07



