five

cetusian/chess-sft-mix-200k

收藏
Hugging Face2026-04-24 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/cetusian/chess-sft-mix-200k
下载链接
链接失效反馈
官方服务:
资源简介:
Chess SFT Mix是一个用于竞争性象棋LLM训练的监督微调数据集,包含约40万行数据。它由三个互补的信号源组成:GM游戏(≥2200 Elo)、Lichess谜题(按评分分层)和Stockfish增强的位置(深度15)。数据集采用统一的聊天格式和单一系统提示,旨在训练模型在给定象棋位置时以标准代数符号(SAN)走出最佳棋步。数据集分为训练集和验证集,训练集包含391,999行,验证集包含8,000行。此外,README还详细介绍了数据集的格式、组件细节、统计信息、使用方法、设计注意事项、局限性和来源及许可信息。

Chess SFT Mix is a supervised fine-tuning dataset assembled for competitive chess LLM training, comprising approximately 400,000 rows. It integrates three complementary sources of signal: GM games (≥2200 Elo), Lichess puzzles (rating-stratified), and Stockfish-enriched positions (depth 15). The dataset is unified in a chat format with a single system prompt, designed to teach the model to play the best move in standard algebraic notation (SAN) given a chess position. The dataset is split into training and validation sets, with 391,999 rows for training and 8,000 for validation. The README also provides detailed information on the format, component details, statistics, usage, design notes, limitations, and source & license.
提供机构:
cetusian
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作