Balderdash Game Dataset
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/ParsaHejabi/Simulation-Framework-for-Multi-Agent-Balderdash
Description:
This dataset contains multiple rounds of the Balderdash game played by different LLM agents. It records the definitions the agents generated, their voting behavior, and their scores across several metrics. It also includes key metrics for analyzing LLM performance: the proportion of true definitions, the deception ratio, the correct guess ratio, and the average score. In terms of scale, the dataset comprises multiple game rounds, each with a fixed number of K players. The task is intended to evaluate the creativity and deception capabilities of large language models in a game environment.
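As a rough illustration of how the four summary metrics could be derived from per-round records, here is a minimal Python sketch. The `PlayerRound` schema, its field names, and the `summarize` helper are hypothetical and do not reflect the repository's actual file format. The rule used for the deception ratio (a definition that is not true but still attracts at least one vote) is likewise an assumption, not the authors' definition.

```python
from dataclasses import dataclass


@dataclass
class PlayerRound:
    """One player's outcome in one round (hypothetical schema)."""
    player: str
    wrote_true_definition: bool  # definition judged equivalent to the real one
    votes_received: int          # votes other players cast for this definition
    guessed_correctly: bool      # player voted for the real definition
    score: float                 # points earned in this round


def summarize(rows):
    """Aggregate per-player ratios over all rounds the player took part in."""
    by_player = {}
    for r in rows:
        by_player.setdefault(r.player, []).append(r)

    summary = {}
    for player, rs in by_player.items():
        n = len(rs)
        summary[player] = {
            "true_definition_ratio": sum(r.wrote_true_definition for r in rs) / n,
            # Assumption: a round is "deceptive" if a non-true definition
            # received at least one vote from another player.
            "deception_ratio": sum(
                (not r.wrote_true_definition) and r.votes_received > 0 for r in rs
            ) / n,
            "correct_guess_ratio": sum(r.guessed_correctly for r in rs) / n,
            "average_score": sum(r.score for r in rs) / n,
        }
    return summary


rows = [
    PlayerRound("a", True, 0, False, 3.0),
    PlayerRound("a", False, 2, True, 3.0),
    PlayerRound("b", False, 0, True, 1.0),
]
result = summarize(rows)
```

Each ratio is normalized by the number of rounds the player took part in, so players with different participation counts remain comparable.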
Provided by:
Simulation framework implemented by the authors



