AIML-TUDA/SLR-Bench-German
Source: Hugging Face. Updated: 2025-10-17. Indexed: 2025-10-18.
Download link:
https://hf-mirror.com/datasets/AIML-TUDA/SLR-Bench-German
Description:
SLR-Bench is a scalable logical reasoning benchmark for evaluating and training large language models (LLMs) on logical reasoning tasks. It is available in English and German; the German version is a translation of the original English dataset. Tasks follow a structured curriculum of 20 complexity levels, grouped into four tiers: basic, easy, medium, and hard. Each task consists of a natural-language prompt, an executable validation program, and a latent ground-truth rule. New inductive reasoning tasks can be generated automatically with controllable complexity, and model answers can be evaluated symbolically and automatically by running the validation program. The dataset is released under the CC BY 4.0 license and is available for download and use.
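The curriculum structure described above can be illustrated with a minimal Python sketch. It is not taken from the dataset itself. The field names (`level`, `prompt`, `validation_program`, `ground_truth_rule`) are hypothetical, and the even split of five levels per tier is an assumption, since the description only states that 20 levels are grouped into four tiers.

```python
from dataclasses import dataclass

# Tier names come from the description; their order is easiest to hardest.
TIERS = ("basic", "easy", "medium", "hard")
NUM_LEVELS = 20


def tier_for_level(level: int) -> str:
    """Map a complexity level (1..20) to a tier.

    ASSUMPTION: tiers are equal-sized (5 levels each). The description
    does not state where the tier boundaries actually fall.
    """
    if not 1 <= level <= NUM_LEVELS:
        raise ValueError(f"level must be in 1..{NUM_LEVELS}, got {level}")
    levels_per_tier = NUM_LEVELS // len(TIERS)
    return TIERS[(level - 1) // levels_per_tier]


@dataclass
class Task:
    """One benchmark task. Field names are hypothetical, not the real schema."""

    level: int
    prompt: str              # natural-language task description
    validation_program: str  # source of the executable checker
    ground_truth_rule: str   # latent rule the model should induce

    @property
    def tier(self) -> str:
        return tier_for_level(self.level)


# Placeholder strings only; these are not real dataset entries.
example = Task(
    level=7,
    prompt="<natural-language prompt>",
    validation_program="<validation program source>",
    ground_truth_rule="<ground-truth rule>",
)
print(example.tier)
```

Check the real column names and tier boundaries against the dataset card before relying on this mapping.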
Provider:
AIML-TUDA



