yuruny/agentic-sudoku-Markov_qwen2.5-3B-it-1e-5_9x9_6-6_gt-SFT_ans1-markovian-eval_results
Source: Hugging Face | Updated: 2025-12-11 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/yuruny/agentic-sudoku-Markov_qwen2.5-3B-it-1e-5_9x9_6-6_gt-SFT_ans1-markovian-eval_results
Description:
Each example in the dataset is a sequence of steps. Every step carries the sub-features action, chat completions, done status, mc_return, model response, observation, and reward. Each example also has a top-level reward feature, separate from the per-step rewards. The dataset has a single split, train, with 100 examples and a total size of 361,886 bytes.
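To make the described layout concrete, here is a minimal Python sketch that summarizes one example. The exact column names (`steps`, `action`, `chat_completions`, `done`, `mc_return`, `model_response`, `observation`, `reward`) are assumptions inferred from the description and should be checked against the dataset's actual features. The mock record is invented purely for illustration, and `load_eval_results` is a hypothetical helper that is defined but not called here.

```python
from typing import Any


def summarize_episode(example: dict[str, Any]) -> dict[str, Any]:
    """Summarize one example, assuming 'steps' is a list of per-step dicts."""
    steps = example["steps"]
    return {
        "num_steps": len(steps),
        # Sum of the per-step rewards (assumed numeric).
        "step_reward_sum": sum(step["reward"] for step in steps),
        # The episode counts as finished if its last step is marked done.
        "finished": bool(steps) and bool(steps[-1]["done"]),
        # Top-level reward, stored separately from the per-step rewards.
        "episode_reward": example["reward"],
    }


def load_eval_results():
    """Hypothetical helper: load the train split (needs `datasets` and network access)."""
    from datasets import load_dataset

    return load_dataset(
        "yuruny/agentic-sudoku-Markov_qwen2.5-3B-it-1e-5_9x9_6-6_gt-SFT_ans1-markovian-eval_results",
        split="train",
    )


# Mock record with the assumed schema; all values are made up for illustration.
mock_example = {
    "steps": [
        {"action": "fill r1c1=5", "chat_completions": [], "done": False,
         "mc_return": 1.0, "model_response": "...", "observation": "...", "reward": 0.0},
        {"action": "fill r1c2=3", "chat_completions": [], "done": True,
         "mc_return": 1.0, "model_response": "...", "observation": "...", "reward": 1.0},
    ],
    "reward": 1.0,
}

summary = summarize_episode(mock_example)
print(summary)
```

If the real dataset stores `steps` as a dict of parallel lists rather than a list of dicts, the per-step loop would need to index into those lists instead.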
Provided by:
yuruny



