pm-25/clembench-rlvr-dataset
Source: Hugging Face. Updated: 2025-08-05. Indexed: 2025-11-01.
Download link:
https://hf-mirror.com/datasets/pm-25/clembench-rlvr-dataset
Description:
The Clembench RLVR Dataset is a complete dataset of game outcomes, covering both wins and losses. It is built by combining the SFT-Final dataset and the DPO_dialogue dataset. Each sample has a unique identifier and contains a query string, a reward value, the sample's origin, a player identifier, and a format marker. The dataset is intended for studying decision-making and outcome analysis in games.
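The record layout described above can be sketched as a small Python type. This is only an illustration: the field names (`sample_id`, `query`, `reward`, `origin`, `player`, `format_ok`), their types, and the example values are assumptions made for this sketch, not the dataset's actual column names or contents, so check the dataset viewer on Hugging Face before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RlvrSample:
    """One record, using hypothetical field names based on the description."""
    sample_id: str   # unique identifier
    query: str       # query string
    reward: float    # reward value
    origin: str      # sample origin (assumed: name of the source dataset)
    player: str      # player identifier
    format_ok: bool  # format marker (assumed to be a boolean here)


def split_by_origin(samples):
    """Group samples by their origin field, preserving input order."""
    groups = {}
    for sample in samples:
        groups.setdefault(sample.origin, []).append(sample)
    return groups


# Made-up records for illustration only; not taken from the dataset.
samples = [
    RlvrSample("a1", "Guess the word.", 1.0, "SFT-Final", "player_1", True),
    RlvrSample("b2", "Describe the grid.", 0.0, "DPO_dialogue", "player_2", True),
]
groups = split_by_origin(samples)
```

Grouping by origin is one simple way to inspect how the two source datasets contribute to the combined set.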
Provider:
pm-25



