
InterwebAlchemy/pgn-lichess-puzzle-dataset

Source: Hugging Face. Updated 2026-03-24. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/InterwebAlchemy/pgn-lichess-puzzle-dataset
Resource description:
---
license: cc0-1.0
language:
- en
tags:
- chess
- pgn
- puzzles
- tactics
pretty_name: Chess Puzzles with PGN Context
size_categories:
- 1K<n<10K
---

# Chess Puzzles with PGN Context

Tactical chess puzzles drawn from the [Lichess Open Puzzle Database](https://database.lichess.org/#puzzles), augmented with full PGN game context reconstructed from the source Lichess games.

Each record presents a middlegame position as a PGN move sequence (the same format used to train PGN language models like [kn1ght](https://github.com/InterwebAlchemy/kn1ght)), together with the engine-validated best move as the label.

## Dataset Summary

- **5,000 puzzles** with reconstructed PGN context
- Rating range: 1200–1900 (mean: 1540)
- Themes: `middlegame`, `short`, `advantage`, `crushing`, `long`, `mate`, `fork`, `kingsideAttack` (middlegame only; opening/endgame excluded)
- Splits: 80% train / 10% validation / 10% test

## Schema

| Column | Type | Description |
|---|---|---|
| `puzzle_id` | string | Lichess puzzle ID |
| `game_id` | string | Lichess game ID (source of PGN context) |
| `rating` | int32 | Puzzle difficulty (Lichess Glicko-2 rating) |
| `themes` | list[string] | Tactical theme tags (e.g. `fork`, `pin`, `skewer`) |
| `pgn_context` | string | PGN move text up to (not including) the puzzle move |
| `fen` | string | Board position at start of puzzle in FEN notation |
| `best_move_uci` | string | Correct first move in UCI notation |
| `best_move_san` | string | Correct first move in SAN notation |

## Usage

```python
from datasets import load_dataset

# Repository ID as given in this listing's title and download link.
ds = load_dataset("InterwebAlchemy/pgn-lichess-puzzle-dataset")

# Each example (selected fields shown):
# {'pgn_context': '1.e4 e5 2.Nf3 Nc6 ... 18.Rxd4',
#  'best_move_san': 'Nf6+',
#  'rating': 1487,
#  'themes': ['fork', 'middlegame']}
```

## How PGN context is reconstructed

For each Lichess puzzle:

1. The source game PGN is fetched from `lichess.org/api/game/<id>`
2. The game is replayed move by move until the board FEN matches the puzzle FEN
3. The move text up to that point is stored as `pgn_context`
4. The first solution move (UCI → SAN) is stored as `best_move_san`

Puzzles where the FEN could not be located in the source game are discarded.

## Intended use

Evaluating and fine-tuning PGN language models on tactical positions. The `pgn_context` field can be fed directly to any model that generates chess moves as PGN continuations.

## Licensing

This dataset is derived from the [Lichess Open Database](https://database.lichess.org/), which is released under [CC0 1.0 (Public Domain)](https://creativecommons.org/publicdomain/zero/1.0/). This derived dataset is also released under CC0 1.0.
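For the evaluation use case described under "Intended use", one possible top-1 accuracy check is sketched below. It is a minimal sketch under stated assumptions. The `generate` callable stands in for whatever model is being evaluated and is hypothetical. The helper names are illustrative. Treating `Nf6` and `Nf6+` as equal by stripping check, mate, and annotation suffixes is a scoring choice made here, not something the dataset card specifies.

```python
import re
from typing import Callable, Iterable, Mapping


def first_san_token(continuation: str) -> str:
    """Return the first move in a generated PGN continuation.

    Move numbers such as "19." or "18..." are skipped, whether they stand
    alone or are glued to the move (e.g. "19.Nf6+").
    """
    for token in continuation.split():
        move = re.sub(r"^\d+\.+", "", token)
        if move:
            return move
    return ""


def normalize_san(san: str) -> str:
    """Drop trailing check/mate markers and annotation glyphs."""
    return san.rstrip("+#!?")


def top1_accuracy(
    records: Iterable[Mapping[str, str]],
    generate: Callable[[str], str],
) -> float:
    """Fraction of records where the model's first generated move equals
    `best_move_san` after normalization."""
    records = list(records)
    if not records:
        return 0.0
    hits = 0
    for record in records:
        predicted = first_san_token(generate(record["pgn_context"]))
        if normalize_san(predicted) == normalize_san(record["best_move_san"]):
            hits += 1
    return hits / len(records)
```

With a loaded split, this could be called as `top1_accuracy(ds["test"], my_model_generate)`, where `my_model_generate` is a placeholder for a function mapping a PGN prefix to generated text.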
Provided by:
InterwebAlchemy