chengranukkkk/SWE-bench_Verified
Source: Hugging Face. Updated 2025-12-12; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/chengranukkkk/SWE-bench_Verified
Resource description:
SWE-bench Verified is a subset of the SWE-bench test set containing 500 samples that have been human-validated for quality. SWE-bench is a dataset that tests a system's ability to resolve GitHub issues automatically. This subset consists of 500 issue-pull request pairs from popular Python repositories. Evaluation is performed by unit test verification, using post-PR behavior as the reference solution. The original SWE-bench dataset was released as part of the paper "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?". The dataset supports one task: given a full repository and a GitHub issue, resolve the issue. The text of the dataset is primarily English.
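The evaluation described above (unit tests checked against post-PR behavior) can be sketched roughly as follows. This is a minimal illustration, not the official harness: the function names `is_resolved` and `resolve_rate`, the split of tests into a "fail to pass" list and a "pass to pass" list, and the test identifiers are all assumptions made for the example.

```python
from typing import Iterable, Mapping


def is_resolved(
    fail_to_pass: Iterable[str],
    pass_to_pass: Iterable[str],
    results: Mapping[str, bool],
) -> bool:
    """Return True only if every required test passed after the candidate patch.

    fail_to_pass: tests expected to go from failing to passing (assumed field).
    pass_to_pass: tests expected to keep passing (assumed field).
    results: test name -> True if the test passed with the candidate patch.
    A test missing from `results` is treated as failed.
    """
    required = list(fail_to_pass) + list(pass_to_pass)
    return all(results.get(name, False) for name in required)


def resolve_rate(outcomes: Iterable[bool]) -> float:
    """Fraction of samples counted as resolved; 0.0 for an empty input."""
    outcomes = list(outcomes)
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


if __name__ == "__main__":
    # Hypothetical test identifiers, for illustration only.
    results = {
        "tests/test_example.py::test_reported_bug": True,
        "tests/test_example.py::test_existing_behavior": True,
    }
    resolved = is_resolved(
        ["tests/test_example.py::test_reported_bug"],
        ["tests/test_example.py::test_existing_behavior"],
        results,
    )
    print(resolved)
```

Under this sketch a sample counts as resolved only when the tests tied to the issue pass and no previously passing test breaks, which is one way to read "post-PR behavior as the reference solution".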
Provider:
chengranukkkk



