vitercik-lab/DSR-Bench

Name: vitercik-lab/DSR-Bench
Creator: vitercik-lab
Published: 2025-10-14 00:50:50
License: No description available

Source: Hugging Face. Updated: 2025-10-14. Indexed: 2025-11-01.

Download link:

https://hf-mirror.com/datasets/vitercik-lab/DSR-Bench

Description:

DSR-Bench is a benchmark for testing the structural reasoning ability of large language models. The dataset spans 6 categories covering 20 data structures and 30 operations, for a total of 2700 questions. Tasks are organized by increasing structural complexity, which allows fine-grained analysis of specific reasoning skills. Within each category, different tasks isolate different sources of structural complexity, so that structural reasoning is broken down into progressively more challenging tasks. Each data structure task has a concise, well-defined correct final state, which supports deterministic scoring without human or model-based judgment. All tasks are generated efficiently from synthetic distributions, which significantly reduces the risk of contamination from pretraining data and enables large-scale evaluation with minimal human involvement.
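The combination of synthetic generation and deterministic scoring described above can be illustrated with a minimal sketch. This is not the benchmark's actual generator or grader: the function names `generate_queue_task` and `score`, the operation format, the value range, and the 40% dequeue probability are all assumptions chosen for illustration only.

```python
import random
from collections import deque


def generate_queue_task(num_ops, seed):
    """Hypothetical generator: sample enqueue/dequeue operations on an
    initially empty queue and return (operations, expected final state)."""
    rng = random.Random(seed)  # seeded, so the same task can be regenerated
    ops = []
    queue = deque()
    for _ in range(num_ops):
        # Only dequeue when the queue is non-empty (assumed 40% chance).
        if queue and rng.random() < 0.4:
            ops.append(("dequeue", None))
            queue.popleft()
        else:
            value = rng.randint(0, 99)
            ops.append(("enqueue", value))
            queue.append(value)
    return ops, list(queue)


def score(predicted_state, expected_state):
    """Deterministic scoring: 1 for an exact match of the final state, else 0."""
    return int(list(predicted_state) == list(expected_state))


ops, expected_state = generate_queue_task(num_ops=8, seed=0)
print(ops)
print(expected_state)
print(score(expected_state, expected_state))
```

Because the expected final state is computed by simulating the operations, a model's answer can be graded by exact comparison, with no human or model-based judge in the loop.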

Provided by:

vitercik-lab
