yuruny/agentic-sudoku-NonMarkov_qwen2.5-3B-it-5e-6_9x9_6-6_gt-SFT_ans1-7k-eval_results
收藏Hugging Face2025-12-14 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/yuruny/agentic-sudoku-NonMarkov_qwen2.5-3B-it-5e-6_9x9_6-6_gt-SFT_ans1-7k-eval_results
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含一系列步骤,每个步骤包含多个属性:动作(action)、聊天完成情况(chat_completions,包括内容和角色)、完成状态(done)、mc_return、模型响应(model_response)、观察(observation)和奖励(reward)。此外,数据集还包含一个顶层的奖励特征。数据集分为训练集,包含102,400个样本,总大小为173,348,570字节,下载大小为7,436,336字节。
The dataset consists of a series of steps, each containing multiple attributes: action, chat_completions (including content and role), done, mc_return, model_response, observation, and reward. Additionally, there is a top-level reward feature. The dataset is split into a training set with 102,400 examples, a total size of 173,348,570 bytes, and a download size of 7,436,336 bytes.
提供机构:
yuruny



