zeyuzy/puzzle-bench

Hugging Face, updated 2026-03-28, indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/zeyuzy/puzzle-bench
Resource description:
---
configs:
- config_name: maze_10x10
  data_files:
  - split: train
    path: maze_10x10/train-*
  - split: test
    path: maze_10x10/test-*
- config_name: maze_15x15
  data_files:
  - split: train
    path: maze_15x15/train-*
  - split: test
    path: maze_15x15/test-*
- config_name: maze_5x5
  data_files:
  - split: train
    path: maze_5x5/train-*
  - split: test
    path: maze_5x5/test-*
- config_name: maze_7x7
  data_files:
  - split: train
    path: maze_7x7/train-*
  - split: test
    path: maze_7x7/test-*
- config_name: sudoku_4x4
  data_files:
  - split: train
    path: sudoku_4x4/train-*
  - split: test
    path: sudoku_4x4/test-*
- config_name: sudoku_9x9
  data_files:
  - split: train
    path: sudoku_9x9/train-*
  - split: test
    path: sudoku_9x9/test-*
license: mit
task_categories:
- text-generation
tags:
- sudoku
- maze
- puzzle
- constraint-satisfaction
- benchmark
language:
- en
size_categories:
- 10K<n<100K
dataset_info:
- config_name: sudoku_4x4
  features:
  - name: puzzle
    dtype: string
  - name: solution
    dtype: string
  - name: empty_count
    dtype: int64
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 592000
    num_examples: 8000
  - name: test
    num_bytes: 148000
    num_examples: 2000
  download_size: 136367
  dataset_size: 740000
- config_name: sudoku_9x9
  features:
  - name: puzzle
    dtype: string
  - name: solution
    dtype: string
  - name: empty_count
    dtype: int64
  - name: steps_count
    dtype: int64
  - name: backtrack_count
    dtype: int64
  - name: max_depth
    dtype: int64
  - name: source
    dtype: string
  - name: difficulty
    dtype: string
  splits:
  - name: train
    num_bytes: 9939525
    num_examples: 41784
  - name: test
    num_bytes: 2485478
    num_examples: 10448
  download_size: 6173960
  dataset_size: 12425003
---

# Puzzle Bench

Difficulty-labeled evaluation datasets for **Sudoku** and **Maze** tasks, designed for benchmarking language models on combinatorial reasoning.

**GitHub:** [zeyuzhangzyz/puzzle-bench](https://github.com/zeyuzhangzyz/puzzle-bench)

## Dataset Overview

| Config | Total | Train | Test | Difficulty Labels |
|--------|-------|-------|------|-------------------|
| `sudoku_4x4` | 10,000 | 8,000 | 2,000 | -- |
| `sudoku_9x9` | 52,806 | 42,244 | 10,562 | easy / medium / hard |
| `maze_5x5` | 10,000 | 8,000 | 2,000 | -- |
| `maze_7x7` | 10,000 | 8,000 | 2,000 | -- |
| `maze_10x10` | 10,000 | 8,000 | 2,000 | -- |
| `maze_15x15` | 30,000 | 24,000 | 6,000 | easy / medium / hard |

## Sudoku

### sudoku_4x4

4x4 Sudoku puzzles generated via backtracking with unique-solution verification.

| Column | Description |
|--------|-------------|
| `puzzle` | 16-character string (0 = empty cell) |
| `solution` | 16-character solution |
| `empty_count` | Number of blank cells |
| `source` | Generator identifier |

### sudoku_9x9

9x9 Sudoku puzzles with solver-computed difficulty metrics. Mixed from multiple sources for balanced difficulty distribution.

| Column | Description |
|--------|-------------|
| `puzzle` | 81-character string (0 = empty cell) |
| `solution` | 81-character solution |
| `empty_count` | Number of blank cells |
| `steps_count` | Solver step count (MRV + backtracking) |
| `backtrack_count` | Number of backtracks |
| `max_depth` | Maximum recursion depth |
| `difficulty` | easy / medium / hard |
| `source` | Source dataset identifier |

| Difficulty | Count | Criterion |
|------------|-------|-----------|
| easy | 29,842 | 0 backtracks (pure logic) |
| medium | 10,000 | 1-1,000 backtracks |
| hard | 12,964 | 1,000+ backtracks |

## Maze

Mazes encoded as binary wall strings with BFS-computed path metrics. Algorithms used: dfs, wilson, prim, kruskal, rdiv.

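
To make the BFS-computed path metrics concrete, here is a minimal sketch of how a shortest-path length and an expanded-node count could be obtained with breadth-first search. It is an illustration under stated assumptions, not the dataset's generator code: the card does not spell out the binary wall-string layout, so the sketch uses a hypothetical grid of text rows where `#` marks a wall, and the names `bfs_metrics` and `parse_coord` are invented for this example. The counting conventions (path length in steps, a node counted as expanded when it is dequeued) are also assumptions and may differ from how `solution_length` and `bfs_nodes` were computed.

```python
from collections import deque


def parse_coord(text):
    """Parse a 'row,col' string such as the card's start and goal fields."""
    row, col = text.split(",")
    return int(row), int(col)


def bfs_metrics(grid, start, goal):
    """Return (path_length_in_steps, nodes_expanded) for a 4-connected grid.

    grid is a list of equal-length strings where '#' is a wall and any other
    character is open. This format is hypothetical, not the dataset encoding.
    The path length is None when the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    expanded = 0
    while queue:
        (r, c), dist = queue.popleft()
        expanded += 1  # count a node when it is taken off the queue
        if (r, c) == goal:
            return dist, expanded
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None, expanded


# Toy 3x3 grid in the hypothetical format.
demo_grid = [
    "..#",
    ".##",
    "...",
]
length, expanded = bfs_metrics(demo_grid, parse_coord("0,0"), parse_coord("2,2"))
print(length, expanded)
```

Because BFS explores cells in order of distance from the start, the first time the goal is dequeued its distance is the shortest path length, which is why a single pass yields both numbers.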

| Column | Description |
|--------|-------------|
| `maze` | Binary string encoding walls |
| `start` | Start coordinates `row,col` |
| `goal` | Goal coordinates `row,col` |
| `grid_size` | Grid dimension (5/7/10/15) |
| `algorithm` | Generation algorithm |
| `solution_length` | BFS shortest path length |
| `bfs_nodes` | BFS nodes expanded |
| `source` | Generator identifier |
| `difficulty` | easy / medium / hard (maze_15x15 only, by solution_length tercile) |

### maze_15x15 difficulty breakdown

| Difficulty | Train | Test | Total |
|------------|-------|------|-------|
| easy | 8,000 | 2,000 | 10,000 |
| medium | 8,000 | 2,000 | 10,000 |
| hard | 8,000 | 2,000 | 10,000 |

## Usage

```python
from datasets import load_dataset

# 4x4 Sudoku
sudoku_4x4 = load_dataset("zeyuzy/puzzle-bench", "sudoku_4x4")

# 9x9 Sudoku, hard difficulty only
sudoku_9x9 = load_dataset("zeyuzy/puzzle-bench", "sudoku_9x9")
hard = sudoku_9x9["test"].filter(lambda x: x["difficulty"] == "hard")

# Maze with train/test split
maze_5x5 = load_dataset("zeyuzy/puzzle-bench", "maze_5x5")

# Maze 15x15 with difficulty labels
maze_15x15 = load_dataset("zeyuzy/puzzle-bench", "maze_15x15")
hard_maze = maze_15x15["test"].filter(lambda x: x["difficulty"] == "hard")
```

## Citation

```bibtex
@software{puzzle-bench,
  title={Puzzle Bench},
  author={Zhang, Zeyu},
  url={https://github.com/zeyuzhangzyz/puzzle-bench},
  year={2026}
}
```
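
When scoring a model on the Sudoku configs, it can help to check an answer structurally as well as comparing it to the `solution` column. The sketch below is one possible checker, offered as an assumption-laden example and not as part of the benchmark: it assumes the `puzzle` and `solution` strings are row-major, use digits `1` to `n`, and use square boxes of side `sqrt(n)`, none of which the card states explicitly beyond "0 = empty cell". The function name `is_valid_solution` is hypothetical.

```python
import math


def is_valid_solution(puzzle: str, candidate: str) -> bool:
    """Check that candidate fills puzzle legally (assumed row-major layout)."""
    n = math.isqrt(len(puzzle))   # grid side: 4 for 16 chars, 9 for 81 chars
    box = math.isqrt(n)           # box side: 2 for 4x4, 3 for 9x9
    if n * n != len(puzzle) or box * box != n or len(candidate) != len(puzzle):
        return False

    digits = {str(d) for d in range(1, n + 1)}

    # Every cell must hold a legal digit and every given clue must be kept.
    for given, filled in zip(puzzle, candidate):
        if filled not in digits or (given != "0" and given != filled):
            return False

    rows = [candidate[r * n:(r + 1) * n] for r in range(n)]
    cols = [candidate[c::n] for c in range(n)]
    boxes = [
        "".join(candidate[(br + r) * n + bc + c]
                for r in range(box) for c in range(box))
        for br in range(0, n, box)
        for bc in range(0, n, box)
    ]
    # Each unit has n cells, so matching the digit set means no repeats.
    return all(set(unit) == digits for unit in rows + cols + boxes)


# Hand-made 4x4 example (not taken from the dataset).
example_puzzle = "1030041020030320"
example_answer = "1234341221434321"
print(is_valid_solution(example_puzzle, example_answer))
```

For puzzles with a verified unique solution, as the card states for `sudoku_4x4`, a plain string comparison against `solution` gives the same verdict; the structural check is mainly useful for diagnosing which kind of mistake a model made.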
Provided by:
zeyuzy