yuruny/agentic-sudoku-Markov-qwen2.5-3B_9x9_6-6_SFT-5e-6-ans1-6k_prm0_actor-eval_results
收藏Hugging Face2025-12-18 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/yuruny/agentic-sudoku-Markov-qwen2.5-3B_9x9_6-6_SFT-5e-6-ans1-6k_prm0_actor-eval_results
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含多个特征,其中steps是一个列表,包含action(动作,字符串类型)、chat_completions(聊天完成,包含content(内容,字符串类型)和role(角色,字符串类型))、done(完成状态,布尔类型)、mc_return(多类别返回,浮点类型)、model_response(模型响应,字符串类型)、observation(观察结果,字符串类型)和reward(奖励,浮点类型)。此外,还有一个顶层的reward特征(浮点类型)。数据集包含一个名为train的分割,包含12800个示例,总大小为118083221字节。下载大小为232762字节,数据集大小为118083221字节。默认配置指定了训练分割的数据文件路径。
The dataset includes multiple features, with steps being a list that contains action (string), chat_completions (containing content (string) and role (string)), done (boolean), mc_return (float64), model_response (string), observation (string), and reward (float64). There is also a top-level reward feature (float64). The dataset has a single split named train with 12,800 examples and a total size of 118,083,221 bytes. The download size is 232,762 bytes, and the dataset size is 118,083,221 bytes. The default configuration specifies the data file path for the train split.
提供机构:
yuruny



