PokerBench Poker Game Evaluation Dataset
Source: HyperAI (超神经) | Updated 2025-01-17 | Indexed 2025-01-18
Download link:
https://hyper.ai/cn/datasets/37218
Resource overview:
PokerBench is a poker evaluation dataset developed in 2025 by researchers at the University of California, Berkeley and the Georgia Institute of Technology. It is designed to evaluate how well large language models (LLMs) play poker, a complex strategic game. The accompanying paper is "PokerBench: Training Large Language Models to become Professional Poker Players". The dataset contains 11k key scenarios, split into 1k pre-flop and 10k post-flop scenarios, covering a wide range of game situations.
Created:
2025-01-16
Collected Summary
Dataset Introduction

Background and Challenges
Background Overview
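PokerBench is a poker evaluation dataset developed in 2025 by the University of California, Berkeley and the Georgia Institute of Technology. It contains 11k key scenarios built on game-theory-optimal (GTO) strategy and is used to evaluate the mathematical reasoning and strategic planning abilities of large language models in poker.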
The content above was collected and summarized by 遇见数据集.



