eth-sri/SWT-bench_Verified_bm25_27k_zsp
Source: Hugging Face. Updated: 2025-02-25. Indexed: 2025-07-05.
Download link:
https://hf-mirror.com/datasets/eth-sri/SWT-bench_Verified_bm25_27k_zsp
Resource description:
SWT-bench Verified is a subset of the SWT-bench dataset, designed to evaluate how well systems can automatically reproduce GitHub issues. It consists of 433 test Issue-Pull Request pairs collected from 11 popular Python GitHub projects. The code context is retrieved with Pyserini's BM25 and limited to 27,000 cl100k_base tokens. Each instance is formatted with the ZeroShotPlus prompt format, so it can be passed directly to a language model to generate patch files.
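The token limit described above can be illustrated with a minimal sketch of budget-limited context packing. This is not the dataset's actual construction code: the function names, the stop-at-first-overflow rule, and the whitespace tokenizer are assumptions made for illustration. To count real cl100k_base tokens, one could pass a counter built on `tiktoken.get_encoding("cl100k_base")` instead.

```python
from typing import Callable, Iterable

TOKEN_BUDGET = 27_000  # context limit stated in the dataset description


def whitespace_token_count(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(text.split())


def pack_context(
    ranked_files: Iterable[tuple[str, str]],
    budget: int = TOKEN_BUDGET,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> list[str]:
    """Keep (path, content) pairs in rank order until the budget would be exceeded.

    `ranked_files` is assumed to be sorted by retrieval score, best first.
    Returns the paths of the files that fit.
    """
    kept: list[str] = []
    used = 0
    for path, content in ranked_files:
        cost = count_tokens(content)
        if used + cost > budget:
            break  # hypothetical policy: stop at the first file that does not fit
        kept.append(path)
        used += cost
    return kept
```

For example, calling `pack_context(files, budget=16)` on three files of 10, 5, and 10 stand-in tokens keeps only the first two.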
Provided by:
eth-sri



