ethanjtang/GAMBIT-lichess-puzzle-positions
Source: Hugging Face. Updated 2026-04-29. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/ethanjtang/GAMBIT-lichess-puzzle-positions
Description:
GAMBIT is a brittleness-testing dataset for evaluating whether chess-trained language models generalize or merely memorize. It is derived from the Lichess Puzzle Database and is split into training and validation sets. The training set contains a large collection of puzzle positions, each paired with its best move. The validation set is organized by theme, with 1,000 samples per theme and 72,625 unique validation puzzles in total. The data is stored as text: each position is given in FEN notation, with the best move in both UCI and SAN formats. The training and validation puzzles are disjoint, so validation results are independent of the training data. The dataset is suited to training and validating chess-related language models.
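As a rough illustration of the notations mentioned in the description, the sketch below splits a FEN string into its six standard fields, breaks a UCI move into its squares, and checks two collections of positions for overlap. The helper names (`parse_fen`, `parse_uci`, `overlap`) and the example position are hypothetical and are not taken from the dataset; the exact file layout is not described here, so the sketch works on plain strings only. Comparing positions by their FEN string is an assumption about how disjointness might be checked.

```python
from typing import Iterable, NamedTuple, Optional


class UciMove(NamedTuple):
    from_square: str
    to_square: str
    promotion: Optional[str]  # e.g. "q" for a promotion to queen, else None


def parse_fen(fen: str) -> dict:
    """Split a FEN string into its six standard fields (no legality checks)."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(fields)}")
    placement, side, castling, en_passant, halfmove, fullmove = fields
    if len(placement.split("/")) != 8:
        raise ValueError("piece placement must describe 8 ranks")
    return {
        "placement": placement,
        "side_to_move": side,          # "w" or "b"
        "castling": castling,          # e.g. "KQkq" or "-"
        "en_passant": en_passant,      # target square or "-"
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }


def parse_uci(move: str) -> UciMove:
    """Split a UCI move such as 'e7e5' or 'e7e8q' into its parts."""
    if len(move) not in (4, 5):
        raise ValueError(f"unexpected UCI move length: {move!r}")
    promotion = move[4] if len(move) == 5 else None
    return UciMove(move[:2], move[2:4], promotion)


def overlap(train_fens: Iterable[str], validation_fens: Iterable[str]) -> set:
    """Return positions present in both collections (assumed: compared by FEN)."""
    return set(train_fens) & set(validation_fens)


# Illustrative position after 1. e4 (not a record from the dataset).
EXAMPLE_FEN = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
EXAMPLE_UCI = "e7e5"  # the same move in SAN would be written "e5"
```

An empty set returned by `overlap` would be consistent with the disjointness described above. Converting between UCI and SAN requires full move generation, so a chess library would be the usual choice for that step.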
Provided by:
ethanjtang



