bbateni/Solitaire-Rule-Reasoning-Benchmark
Source: Hugging Face. Updated 2025-10-11. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/bbateni/Solitaire-Rule-Reasoning-Benchmark
Description:
The Solitaire Rule Reasoning Benchmark evaluates how well large language models (LLMs) perform simple rule reasoning. It covers multiple Solitaire card game variants, with 1,000 samples per variant; the samples are possible states reached over 100 simulated games. Each sample includes a game ID, the current game state, the current game view, a proposed action, a summary of the action, the validity of the action, and the view of the next game state.
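To make the sample layout concrete, the following is a minimal Python sketch of one record and of scoring a model on the validity label. The field names (`game_id`, `state`, `view`, `action`, `action_summary`, `is_valid`, `next_view`), the `validity_accuracy` helper, and the toy records are all assumptions made for illustration; the actual column names and values should be taken from the dataset itself.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Sample:
    # Hypothetical field names mirroring the seven items listed in the description.
    game_id: str
    state: str            # current game state
    view: str             # current game view
    action: str           # proposed action
    action_summary: str   # summary of the action
    is_valid: bool        # whether the proposed action is valid under the rules
    next_view: str        # view of the next game state


def validity_accuracy(samples: Iterable[Sample],
                      predict: Callable[[Sample], bool]) -> float:
    """Fraction of samples where the predicted validity matches the label."""
    samples = list(samples)
    if not samples:
        return 0.0
    correct = sum(predict(s) == s.is_valid for s in samples)
    return correct / len(samples)


def always_valid(sample: Sample) -> bool:
    """Trivial baseline: claim every proposed action is valid."""
    return True


# Placeholder records, not real benchmark data.
toy_samples = [
    Sample("g1", "state-1", "view-1", "action-a", "summary-a", True, "view-2"),
    Sample("g1", "state-2", "view-2", "action-b", "summary-b", False, "view-2"),
    Sample("g2", "state-3", "view-3", "action-c", "summary-c", True, "view-4"),
    Sample("g2", "state-4", "view-4", "action-d", "summary-d", True, "view-5"),
]

baseline_score = validity_accuracy(toy_samples, always_valid)
print(f"always-valid baseline accuracy: {baseline_score:.2f}")
```

In practice `predict` would wrap a call to the model under test, for example by prompting it with the game view and proposed action and parsing a yes/no answer. A constant baseline such as `always_valid` is a useful reference point, since its score only reflects the share of valid actions in the data.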
Provided by:
bbateni



