
max-rl/maze_17x17_diverse_1.3m

Source: Hugging Face | Updated: 2026-04-29 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/max-rl/maze_17x17_diverse_1.3m
Description:
17x17 Maze with Diverse Paths (1.3M, 16 paths/maze) is a dataset for reinforcement learning and reasoning tasks. It contains 1,299,992 mazes on a 17x17 grid, each with nominally 16 distinct paths (15.32 on average). Some walls are knocked out to increase path diversity, which spreads path lengths over a wide range (28 to 58) so that a continuous reward function differs significantly from a binary one. The continuous reward has a mean of 0.518 and a standard deviation of 0.283. The dataset files include the raw build output, SFT-ready training files, and test files.
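The contrast between a binary and a continuous reward can be illustrated with a minimal sketch. The listing does not define either reward function, so the linear length normalization below, the function names, and the use of 28 and 58 as bounds (taken from the reported path-length range) are all assumptions for illustration, not the dataset's actual definition.

```python
# Hypothetical reward functions; the dataset's real definitions are not given
# in the listing, so treat everything here as an illustrative assumption.

MIN_LEN = 28  # shortest path length reported in the description
MAX_LEN = 58  # longest path length reported in the description


def binary_reward(reached_goal: bool) -> float:
    """Return 1.0 for any path that reaches the goal, 0.0 otherwise."""
    return 1.0 if reached_goal else 0.0


def continuous_reward(path_len: int, reached_goal: bool,
                      min_len: int = MIN_LEN, max_len: int = MAX_LEN) -> float:
    """Assumed linear score: shortest valid path -> 1.0, longest -> 0.0."""
    if not reached_goal:
        return 0.0
    # Clamp the length into the reported range before normalizing.
    clipped = min(max(path_len, min_len), max_len)
    return (max_len - clipped) / (max_len - min_len)


if __name__ == "__main__":
    for length in (28, 43, 58):
        print(length, binary_reward(True), continuous_reward(length, True))
```

Under this assumption every successful path earns the same binary reward, while the continuous reward separates short paths from long ones, which is why a wide spread of path lengths matters.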
Provider:
max-rl