novastar112/sokoban_v0_random
Source: Hugging Face | Updated: 2026-04-28 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/novastar112/sokoban_v0_random
Description:
This dataset contains random-policy interaction trajectories for the custom VisGym Sokoban environment. By default, trajectories are generated under the medium task configuration with a rollout cap of 20 steps. Each trajectory contains base64-encoded JPEG frames, the prompt, the sampled action, the reward, and environment info. The random policy samples uniformly among the valid move/push actions and, after the first step, samples the valid stop action with the configured random stop probability. The dataset also includes generation counts, a hash audit, and validation metadata.
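The random policy described above can be illustrated with a minimal Python sketch. This is not the dataset's actual generation code: the function names `sample_action` and `rollout`, the action strings, the `"stop"` label, and the default `stop_prob` value are all hypothetical. It also assumes one plausible reading of the description, namely that the stop decision is drawn first (from the second step onward) and a move/push action is drawn uniformly otherwise.

```python
import random


def sample_action(valid_moves, step_index, stop_prob, rng):
    """Hypothetical sketch of the random policy (names are assumptions).

    valid_moves: non-empty list of currently valid move/push actions.
    step_index:  0 for the first step of the rollout.
    stop_prob:   configured random stop probability.
    """
    # Stop is only eligible after the first step.
    if step_index > 0 and rng.random() < stop_prob:
        return "stop"
    # Otherwise pick uniformly among the valid move/push actions.
    return rng.choice(valid_moves)


def rollout(get_valid_moves, max_steps=20, stop_prob=0.1, seed=0):
    """Sample actions until a stop is drawn or the rollout cap is reached.

    get_valid_moves is a hypothetical callback returning the valid
    move/push actions at a given step; a real environment would supply it.
    """
    rng = random.Random(seed)
    actions = []
    for t in range(max_steps):
        action = sample_action(get_valid_moves(t), t, stop_prob, rng)
        actions.append(action)
        if action == "stop":
            break
    return actions
```

For example, `rollout(lambda t: ["up", "down", "left", "right"])` returns at most 20 actions, and the first one is never `"stop"`.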
Provider:
novastar112



