InfiniteBench

Indexed from arXiv on 2025-09-30

Download link:

https://github.com/openbmb/infinitebench

Overview:

InfiniteBench is a benchmark for evaluating long-context language models in the infinite-context setting. It is also used to demonstrate the strong performance of EM-LLM on long-context tasks. The dataset is large, with context lengths of up to 10 million tokens, and its purpose is to compare how language models perform on infinite-context tasks.
