InfiniteBench
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/openbmb/infinitebench
Resource overview:
This dataset serves as a benchmark for evaluating long-context models in the infinite-context setting, and is also used to demonstrate the superior performance of EM-LLM on long-context tasks. It is a large-scale dataset with context lengths of up to 10 million tokens, and its task is to compare the performance of language models on infinite-context tasks.



