SWE-bench/SWE-bench
Source: Hugging Face. Updated: 2025-04-29. Indexed: 2025-07-05.
Download link:
https://hf-mirror.com/datasets/SWE-bench/SWE-bench
Summary:
SWE-bench is a dataset for testing whether systems can resolve GitHub issues automatically. It contains 2,294 issue and pull request pairs collected from 12 popular Python repositories. Evaluation is by unit test verification, with post-PR behavior serving as the reference solution. The dataset was released as part of the paper "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?"
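The unit test check described in the summary can be illustrated with a minimal sketch. It assumes each task supplies two lists of test names (tests that the reference PR turns from failing to passing, and tests that must keep passing), and that a test run with the candidate patch applied yields a name-to-outcome mapping. The function name `is_resolved`, the `results` mapping, and the test names are hypothetical and chosen for illustration; this is not the official evaluation harness.

```python
from typing import Dict, Iterable


def is_resolved(
    fail_to_pass: Iterable[str],
    pass_to_pass: Iterable[str],
    results: Dict[str, bool],
) -> bool:
    """Return True if a candidate patch reproduces the post-PR test behavior.

    fail_to_pass: tests that fail before the reference PR and pass after it.
    pass_to_pass: tests that pass both before and after the reference PR.
    results: hypothetical mapping of test name -> passed?, obtained by
             running the test suite with the candidate patch applied.
    """
    required = list(fail_to_pass) + list(pass_to_pass)
    # A test that is missing from the results is treated as a failure.
    return all(results.get(name, False) for name in required)


# Made-up test names, for illustration only.
results = {"test_fix": True, "test_existing": True}
print(is_resolved(["test_fix"], ["test_existing"], results))
```

Under this sketch a patch counts as resolving the issue only when every required test passes, so fixing the reported bug while breaking an existing test is still a failure.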
Provider:
SWE-bench



