Loong

Indexed from arXiv on 2025-09-30

Download link:

https://github.com/mozerwang/loong

Overview:

Loong is a benchmark for evaluating long-context language models through extended multi-document question answering, designed to reflect realistic scenarios. It defines four task types (spotlight locating, comparison, clustering, and chain of reasoning) that together assess how well a model understands long contexts. The benchmark spans a range of context lengths across these tasks, all of which are framed as extended multi-document question answering.
