yuruny/agentic-sudoku-NonMarkov_qwen2.5-3B-it-5e-6_9x9_6-6_gt-SFT_ans1-7k-eval_results

Name: yuruny/agentic-sudoku-NonMarkov_qwen2.5-3B-it-5e-6_9x9_6-6_gt-SFT_ans1-7k-eval_results
Creator: yuruny
Published: 2025-12-14 16:33:49
License: 暂无描述

Hugging Face2025-12-14 更新2025-12-20 收录

下载链接：

https://hf-mirror.com/datasets/yuruny/agentic-sudoku-NonMarkov_qwen2.5-3B-it-5e-6_9x9_6-6_gt-SFT_ans1-7k-eval_results

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集包含一系列步骤，每个步骤包含多个属性：动作（action）、聊天完成情况（chat_completions，包括内容和角色）、完成状态（done）、mc_return、模型响应（model_response）、观察（observation）和奖励（reward）。此外，数据集还包含一个顶层的奖励特征。数据集分为训练集，包含102,400个样本，总大小为173,348,570字节，下载大小为7,436,336字节。

The dataset consists of a series of steps, each containing multiple attributes: action, chat_completions (including content and role), done, mc_return, model_response, observation, and reward. Additionally, there is a top-level reward feature. The dataset is split into a training set with 102,400 examples, a total size of 173,348,570 bytes, and a download size of 7,436,336 bytes.

提供机构：

yuruny