fang0608/LongBench-Pro

Name: fang0608/LongBench-Pro
Creator: fang0608
Published: 2026-04-27 23:51:52
License: not specified

Source: Hugging Face. Updated 2026-04-27; indexed 2026-05-03.

Download link:

https://hf-mirror.com/datasets/fang0608/LongBench-Pro

Description:

LongBench Pro is a more realistic and more comprehensive bilingual benchmark for long-context evaluation. It contains 1,500 samples, is built entirely on authentic, natural long documents, and includes 11 primary tasks and 25 secondary tasks, covering all long-context capabilities assessed by existing benchmarks. It employs diverse evaluation metrics, which allows a more fine-grained measurement of model abilities, and provides a balanced set of bilingual samples in English and Chinese. In addition, LongBench Pro introduces a multi-dimensional taxonomy to support comprehensive evaluation of models under different operating conditions: Context Requirement (full context, which calls for global integration, versus partial context, which calls for local retrieval), Length (six lengths from 8k to 256k tokens, with samples uniformly distributed across them), and Difficulty (four levels ranging from Easy to Extreme).
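The taxonomy described above lends itself to per-dimension score breakdowns. The following is a minimal sketch of that idea, not the benchmark's own evaluation code: the field names `context_requirement`, `length`, `difficulty`, and `score` are hypothetical, and the toy records are made up for illustration. The dataset's actual schema should be checked before adapting it.

```python
from collections import defaultdict
from statistics import mean


def scores_by_dimension(records, dimension):
    """Average the per-sample score within each value of one taxonomy dimension."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record[dimension]].append(record["score"])
    return {value: mean(scores) for value, scores in buckets.items()}


# Toy records with hypothetical field names and made-up scores;
# these are not real LongBench Pro samples or results.
toy_records = [
    {"context_requirement": "full", "length": "8k", "difficulty": "easy", "score": 1.0},
    {"context_requirement": "partial", "length": "8k", "difficulty": "extreme", "score": 0.0},
    {"context_requirement": "full", "length": "256k", "difficulty": "easy", "score": 0.5},
]

by_length = scores_by_dimension(toy_records, "length")
by_difficulty = scores_by_dimension(toy_records, "difficulty")
```

Grouping on one dimension at a time keeps each bucket large enough to be meaningful; the same function could be called with `"context_requirement"` to compare full-context and partial-context samples.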

Provider:

fang0608
