RepoQA

arXiv, indexed 2025-09-30

Download link:

https://evalplus.github.io/repoqa.html

Resource overview:

RepoQA is an advanced "needle-in-a-haystack" test that evaluates the long-context code understanding and retrieval capabilities of large language models (LLMs). It uses a context size of 16,000 tokens and comprises 500 subtasks/tests: 5 programming languages × 10 code repositories × 10 needle functions. The task type is code retrieval.
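The 500-subtask count in the description can be checked with a small sketch. This is only an illustration of the arithmetic, not the benchmark's own tooling: the constant names and the `enumerate_tasks` helper are hypothetical, and the grouping (10 needle functions per repository, 10 repositories per language) is an assumption about how the three factors combine.

```python
from itertools import product

# Counts taken from the description; index labels are placeholders, not real names.
NUM_LANGUAGES = 5
REPOS_PER_LANGUAGE = 10   # assumed grouping: repositories nested under languages
NEEDLES_PER_REPO = 10     # assumed grouping: needle functions nested under repositories
CONTEXT_TOKENS = 16_000   # context size stated in the description


def enumerate_tasks():
    """Return one (language, repo, needle) index triple per subtask."""
    return list(
        product(
            range(NUM_LANGUAGES),
            range(REPOS_PER_LANGUAGE),
            range(NEEDLES_PER_REPO),
        )
    )


tasks = enumerate_tasks()
print(len(tasks))  # 500
```

Each triple stands for one retrieval test: a single needle function to be located within a context of roughly `CONTEXT_TOKENS` tokens drawn from its repository.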
