
codingmonster1234/chess-puzzles-rlvr

Hugging Face | Updated: 2026-04-21 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/codingmonster1234/chess-puzzles-rlvr
Resource description:

---
dataset_info:
  features:
  - name: fen
    dtype: string
  - name: rating
    dtype: int64
  - name: tags
    list: string
  - name: turn
    dtype: string
  - name: uci_moves
    list: string
  - name: uuid
    dtype: string
  splits:
  - name: train
    num_bytes: 867658667
    num_examples: 4278346
  - name: validation
    num_bytes: 102119282
    num_examples: 503581
  - name: test
    num_bytes: 51001348
    num_examples: 251434
  download_size: 786486627
  dataset_size: 1020779297
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
license: mit
language:
- en
size_categories:
- 1M<n<10M
---

# Dataset Card: Chess-Puzzles-RLVR

## Dataset Summary

This dataset is a highly processed and stratified collection of approximately **5 million chess puzzles**, ranging from Elo ratings of **400 to 3300**. It is specifically designed for **Curriculum Learning** and **Reinforcement Learning (RLVR)** agents.

Unlike standard puzzle datasets, this version is pre-sorted and split into "rating buckets" to ensure that training, validation, and testing sets maintain an identical difficulty distribution across the entire spectrum.

---

## Dataset Structure

### Data Instances

Each instance represents a unique chess puzzle with a starting position (FEN) and the correct sequence of moves (UCI).

### Key Fields (Schema)

* **`fen`** *(string)*: The Forsyth-Edwards Notation representing the board state before the first move of the puzzle.
* **`uci_moves`** *(list of strings)*: The sequence of best moves in Universal Chess Interface (UCI) format (e.g., `["e2e4", "e7e5"]`).
* **`rating`** *(int)*: The difficulty rating of the puzzle (Elo).
* **`tags`** *(list of strings)*: Tactical motifs associated with the puzzle (e.g., `["fork", "sacrifice", "mateIn2"]`).
* **`turn`** *(string)*: Indicates which side is to move in the starting FEN (`"White"` or `"Black"`).
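As a rough illustration of the schema above, the following sketch checks that a record's `turn` agrees with the active-color field of its `fen` and that each entry in `uci_moves` has a plausible length. The `PuzzleRecord` type, the helper names, and the example record (including its `uuid` and `tags` values) are hypothetical and not taken from the dataset; only the field names and the `"White"`/`"Black"` values come from the card.

```python
from typing import List, TypedDict


class PuzzleRecord(TypedDict):
    """Hypothetical type mirroring the feature list in the card's metadata."""
    fen: str
    rating: int
    tags: List[str]
    turn: str
    uci_moves: List[str]
    uuid: str


def side_to_move(fen: str) -> str:
    """Return "White" or "Black" from the active-color (second) field of a FEN."""
    fields = fen.split()
    if len(fields) < 2 or fields[1] not in ("w", "b"):
        raise ValueError(f"malformed FEN: {fen!r}")
    return "White" if fields[1] == "w" else "Black"


def check_record(record: PuzzleRecord) -> None:
    """Raise ValueError if the record contradicts the documented schema."""
    if side_to_move(record["fen"]) != record["turn"]:
        raise ValueError("turn does not match the FEN active color")
    for move in record["uci_moves"]:
        # UCI moves are 4 characters (e.g. "e2e4"), or 5 with a promotion piece.
        if len(move) not in (4, 5):
            raise ValueError(f"unexpected UCI move: {move!r}")


# Made-up record for illustration only; it is not a row from the dataset.
example: PuzzleRecord = {
    "fen": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    "rating": 400,
    "tags": ["opening"],
    "turn": "White",
    "uci_moves": ["e2e4", "e7e5"],
    "uuid": "example-0000",
}
check_record(example)
```

This check only looks at string shape; verifying that the moves are legal in the position would need a chess library such as `python-chess`.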
### Data Splits

The dataset is split into three parts, with each split containing a proportionate amount of data from every 100-point rating interval:

| Split | Percentage | Purpose |
| :--- | :--- | :--- |
| **Train** | 85% | Primary data for model training. |
| **Validation** | 10% | Monitoring performance and preventing catastrophic forgetting across difficulty tiers. |
| **Test** | 5% | Final holdout set for objective evaluation. |

---

## Creation Process

### 1. Data Cleaning and Transformation

The dataset was transformed from a raw chess puzzle format using the `python-chess` library. The following steps were taken for every row:

* **Turn Extraction**: The active color was parsed directly from the FEN string.
* **String Tokenization**: Raw space-separated strings for `moves` and `tags` were converted into clean Python lists for easier model consumption.
* **Feature Pruning**: Redundant boolean flags (e.g., `white_kingside`, `board`) were removed to reduce the dataset footprint and focus strictly on necessary state representation.

### 2. Stratified Bucketing

To facilitate curriculum learning, the dataset underwent a unique **Stratified Bucketing** process:

1. The entire dataset was sorted globally by **rating**.
2. The data was partitioned into **29 buckets**, each representing a 100-point rating range (e.g., 400-500, 501-600, ..., 3200-3300).
3. The 85/10/5 split was applied **locally within each bucket**.
4. These local splits were then re-concatenated into the final global `train`, `validation`, and `test` splits.

This ensures that whether the model is training on "easy" or "hard" data, the validation set always provides a statistically accurate reflection of the model's ability across the entire difficulty spectrum.

---

## Usage Considerations

This dataset is optimized for a **Sliding Window Sampler**. During training, it is recommended to:

1. Sample **80%** of your batch from the model's current "target" rating bucket.
2. Sample **20%** from all previously learned (easier) buckets to maintain tactical proficiency and prevent regression.
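The 80/20 recommendation above could be sketched as follows. This is a minimal illustration rather than the author's sampler: the function names are hypothetical, the bucket boundaries follow the 400-500, 501-600, ..., 3200-3300 ranges from the card, and two details are assumptions the card does not specify (sampling with replacement, and drawing the 20% share uniformly over all puzzles pooled from the easier buckets).

```python
import random
from typing import Dict, List


def bucket_index(rating: int, low: int = 400, width: int = 100, n_buckets: int = 29) -> int:
    """Map a rating to its bucket: 400-500 -> 0, 501-600 -> 1, ..., 3200-3300 -> 28."""
    idx = (rating - low - 1) // width
    return min(max(idx, 0), n_buckets - 1)


def group_by_bucket(puzzles: List[dict]) -> Dict[int, List[dict]]:
    """Group puzzle records by the rating bucket they fall into."""
    buckets: Dict[int, List[dict]] = {}
    for puzzle in puzzles:
        buckets.setdefault(bucket_index(puzzle["rating"]), []).append(puzzle)
    return buckets


def sample_batch(
    buckets: Dict[int, List[dict]],
    target: int,
    batch_size: int,
    rng: random.Random,
    target_frac: float = 0.8,
) -> List[dict]:
    """Draw ~80% of a batch from the target bucket and the rest from easier buckets."""
    # Assumption: easier buckets are pooled and sampled uniformly per puzzle.
    easier = [p for b in sorted(buckets) if b < target for p in buckets[b]]
    # With no easier buckets (the first curriculum stage), use the target bucket only.
    n_target = batch_size if not easier else round(batch_size * target_frac)
    batch = rng.choices(buckets[target], k=n_target)  # sampling with replacement
    if easier:
        batch += rng.choices(easier, k=batch_size - n_target)
    rng.shuffle(batch)
    return batch
```

A training loop would typically advance `target` once the model's validation accuracy on the current bucket passes some threshold; that schedule is left to the user.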

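### Dataset Information

#### Features

1. **`fen`**: string; the Forsyth-Edwards Notation (FEN) representing the board state before the first move of the puzzle
2. **`rating`**: 64-bit integer; the Elo difficulty rating of the puzzle
3. **`tags`**: list of strings; the tactical motifs associated with the puzzle
4. **`turn`**: string; indicates which side is to move in the starting FEN (`"White"` or `"Black"`)
5. **`uci_moves`**: list of strings; the sequence of best moves in Universal Chess Interface (UCI) format

#### Data Splits

1. **Train**: 696,475,259 bytes, 4,278,346 examples
2. **Validation**: 81,978,341 bytes, 503,581 examples
3. **Test**: 40,931,135 bytes, 251,434 examples

#### Metadata

- Download size: 606,009,545 bytes
- Total dataset size: 819,384,735 bytes

#### Configuration

Data file paths for the default configuration:

- Train: `data/train-*`
- Validation: `data/validation-*`
- Test: `data/test-*`
- License: MIT
- Language: English
- Size category: 1M < n < 10M examples

---

## Dataset Card: Chess-Puzzles-RLVR

### Dataset Summary

This dataset is a highly preprocessed and stratified collection of approximately 5 million chess puzzles, with Elo ratings ranging from 400 to 3300. It is designed for Curriculum Learning and Reinforcement Learning (RLVR) agents.

Unlike standard puzzle datasets, this version is pre-sorted and split into "rating buckets" so that the training, validation, and test sets keep the same difficulty distribution across the entire range.

---

### Dataset Structure

#### Data Instances

Each instance is a unique chess puzzle, consisting of the board state before the first move of the puzzle and the correct sequence of moves.

#### Key Fields (Schema)

| Field | Type | Description |
| :--- | :--- | :--- |
| **`fen`** | string | Forsyth-Edwards Notation (FEN) representing the board state before the first move of the puzzle |
| **`uci_moves`** | list of strings | Sequence of best moves in Universal Chess Interface (UCI) format, e.g. `["e2e4", "e7e5"]` |
| **`rating`** | integer | Elo difficulty rating of the puzzle |
| **`tags`** | list of strings | Tactical motifs associated with the puzzle, e.g. `["fork", "sacrifice", "mateIn2"]` |
| **`turn`** | string | Side to move in the starting FEN, either `"White"` or `"Black"` |

#### Data Splits

The dataset is split into three parts, each containing a proportionate share of the data from every 100-point rating interval:

| Split | Percentage | Purpose |
| :--- | :--- | :--- |
| **Train** | 85% | Primary data for model training |
| **Validation** | 10% | Monitoring performance and preventing catastrophic forgetting across difficulty tiers |
| **Test** | 5% | Final holdout set for objective evaluation |

---

### Creation Process

#### 1. Data Cleaning and Transformation

The dataset was converted from a raw chess puzzle format using the `python-chess` library. The following steps were applied to every row:

- **Turn extraction**: the side to move was parsed directly from the FEN string
- **String tokenization**: the raw space-separated `moves` and `tags` strings were converted into clean Python lists for easier model consumption
- **Feature pruning**: redundant boolean flags (e.g. `white_kingside`, `board`) were removed to reduce the dataset footprint and keep only the necessary state representation

#### 2. Stratified Bucketing

To support curriculum learning, the dataset went through a **stratified bucketing** process:

1. The entire dataset was sorted globally by rating
2. The data was partitioned into 29 buckets, each covering a 100-point rating range (e.g. 400-500, 501-600, ..., 3200-3300)
3. The 85/10/5 split was applied within each bucket
4. The per-bucket splits were re-concatenated into the final global train, validation, and test splits

This ensures that whether the model is training on "easy" or "hard" data, the validation set remains a statistically accurate reflection of its performance across the entire difficulty range.

---

### Usage Considerations

This dataset is optimized for a **sliding window sampler**. During training, the following strategy is recommended:

1. Sample 80% of each batch from the model's current "target" rating bucket
2. Sample 20% of each batch from all previously learned (easier) buckets, to maintain tactical proficiency and prevent regression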
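The four stratified bucketing steps described in the creation process can be sketched roughly as below. This is an illustration under stated assumptions, not the author's pipeline: the function names are hypothetical, and shuffling within each bucket before slicing (with a fixed `seed`) is an assumption, since the card does not say how rows were ordered inside a bucket.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def bucket_index(rating: int, low: int = 400, width: int = 100, n_buckets: int = 29) -> int:
    """Map a rating to its bucket: 400-500 -> 0, 501-600 -> 1, ..., 3200-3300 -> 28."""
    idx = (rating - low - 1) // width
    return min(max(idx, 0), n_buckets - 1)


def stratified_split(
    records: List[dict],
    train_pct: int = 85,
    val_pct: int = 10,
    seed: int = 0,
) -> Tuple[List[dict], List[dict], List[dict]]:
    """Split records 85/10/5 inside every rating bucket, then concatenate."""
    rng = random.Random(seed)

    # Step 1: sort globally by rating.
    ordered = sorted(records, key=lambda r: r["rating"])

    # Step 2: partition into 100-point rating buckets.
    by_bucket: Dict[int, List[dict]] = defaultdict(list)
    for record in ordered:
        by_bucket[bucket_index(record["rating"])].append(record)

    train: List[dict] = []
    val: List[dict] = []
    test: List[dict] = []
    for idx in sorted(by_bucket):
        rows = by_bucket[idx]
        # Assumption: shuffle inside the bucket so each split spans its full range.
        rng.shuffle(rows)
        # Step 3: apply the split locally; integer math avoids float rounding.
        n_train = len(rows) * train_pct // 100
        n_val = len(rows) * val_pct // 100
        # Step 4: re-concatenate the local splits into the global ones.
        train += rows[:n_train]
        val += rows[n_train:n_train + n_val]
        test += rows[n_train + n_val:]
    return train, val, test
```

Because the split is applied per bucket, every bucket contributes to all three splits in the same proportion, which is what keeps the difficulty distributions aligned.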
Provider:
codingmonster1234