yuruny/agentic-sudoku-NonMarkov_qwen2.5-3B-it-1e-5_9x9_6-6_gt-SFT_ans1-non_markovian-eval_results_dev
收藏Hugging Face2025-12-11 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/yuruny/agentic-sudoku-NonMarkov_qwen2.5-3B-it-1e-5_9x9_6-6_gt-SFT_ans1-non_markovian-eval_results_dev
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含一系列步骤,每个步骤包含多个属性,如动作、聊天完成情况、奖励和观察结果。具体特征包括:动作(字符串类型)、聊天完成情况(包含内容和角色的列表)、完成状态(布尔类型)、mc_return(浮点类型)、模型响应(字符串类型)、观察结果(字符串类型)和奖励(浮点类型)。数据集分为一个训练集,包含100个示例,总大小为559,604字节。
The dataset consists of a series of steps, each containing various attributes such as actions, chat completions, rewards, and observations. The features include: action (string), chat completions (a list with content and role), done (boolean), mc_return (float64), model_response (string), observation (string), and reward (float64). The dataset is split into a single train split with 100 examples and a total size of 559,604 bytes.
提供机构:
yuruny



