vitercik-lab/DSR-Bench-challenge
Source: Hugging Face. Updated: 2025-05-16. Indexed: 2025-11-01.
Download link:
https://hf-mirror.com/datasets/vitercik-lab/DSR-Bench-challenge
Description:
DSR-Bench-challenge is a subset of DSR-Bench problems built on more complex, non-standard data structures, with tasks that require multi-step reasoning. It covers 15 data structures and contains 450 questions. We tested 4 reasoning models and 5 instruction-tuned models, and none of them scored above 0.5 out of 1 on DSR-Bench-challenge. The dataset is designed to expose the remaining weaknesses of existing models in structural reasoning and to promote further research into the structural reasoning capabilities of large language models.
Provided by:
vitercik-lab



