vitercik-lab/DSR-Bench-main

Name: vitercik-lab/DSR-Bench-main
Creator: vitercik-lab
Published: 2025-05-16 00:01:09
License: not specified

Source: Hugging Face. Updated 2025-05-16. Indexed 2025-11-01.

Download link:

https://hf-mirror.com/datasets/vitercik-lab/DSR-Bench-main

Description:

DSR-Bench is a benchmark for LLMs designed to test their structural reasoning ability: the ability to understand and manipulate data according to specific relationships such as order, hierarchy, and connectivity. It spans 20 data structures in 6 categories and 30 operations, for a total of 2,700 questions. Its strengths are hierarchical organization, deterministic evaluation, and low-contamination data. Tasks are organized by increasing structural complexity, which allows fine-grained analysis of specific reasoning skills. Each data structure task has a concise, well-defined correct final state, so scoring is deterministic and unambiguous. All tasks are generated efficiently from synthetic distributions, which significantly reduces the risk of contamination from pretraining data.
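
As an illustration of what deterministic scoring against a well-defined final state can look like, here is a minimal Python sketch for a queue task. The task format, the operation names, and the helpers `generate_task`, `final_queue_state`, and `score` are assumptions made for this example; they are not taken from the DSR-Bench dataset or its code.

```python
import random
from collections import deque


def generate_task(seed, n_ops=5):
    """Build a hypothetical queue task from a seeded synthetic distribution."""
    rng = random.Random(seed)
    initial = [rng.randint(0, 99) for _ in range(3)]
    operations = []
    for _ in range(n_ops):
        if rng.random() < 0.5:
            operations.append(("enqueue", rng.randint(0, 99)))
        else:
            operations.append(("dequeue",))
    return initial, operations


def final_queue_state(initial, operations):
    """Apply the operations in order and return the final queue as a list."""
    queue = deque(initial)
    for op in operations:
        if op[0] == "enqueue":
            queue.append(op[1])
        elif op[0] == "dequeue":
            # Assumption: dequeue on an empty queue is a no-op.
            if queue:
                queue.popleft()
        else:
            raise ValueError(f"unknown operation: {op[0]}")
    return list(queue)


def score(model_answer, initial, operations):
    """Return 1 if the answer equals the reference final state exactly, else 0."""
    return int(model_answer == final_queue_state(initial, operations))


# Hand-written example instance (not from the dataset).
initial = [3, 1, 4]
operations = [("enqueue", 1), ("dequeue",), ("enqueue", 5)]
print(final_queue_state(initial, operations))  # [1, 4, 1, 5]
print(score([1, 4, 1, 5], initial, operations))  # 1
```

Because the reference state is computed by running the operations, a model answer either matches it exactly or does not, and a seeded generator can produce fresh instances that are unlikely to appear in pretraining data.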

Provider:

vitercik-lab
