RepoQA
Source: arXiv. Indexed: 2025-09-30
Download link:
https://evalplus.github.io/repoqa.html
Overview:
RepoQA is an advanced "needle-in-a-haystack" test that evaluates the long-context code understanding and retrieval capabilities of large language models (LLMs). It uses a fixed context size of 16,000 tokens and contains 500 subtasks/tests spanning 5 programming languages, with 10 code repositories per language and 10 needle functions per repository (5 × 10 × 10 = 500). The core task is code retrieval.



