TReB
Source: arXiv (indexed 2025-09-30)
Download link:
https://huggingface.co/datasets/TReB
Description:
TReB is a comprehensive benchmark for evaluating the table reasoning capabilities of large language models (LLMs). It comprises 26 subtasks that range from shallow table understanding to deep table reasoning. The dataset was built with a hybrid strategy that combines keyword-based retrieval with manual screening, and a unified post-processing pipeline is applied to keep the tasks accurate and consistent.



