SWE-bench_bm25_13k_cl100k
Source: ModelScope community (魔搭社区). Updated 2025-12-05; indexed 2025-08-16.
Download link:
https://modelscope.cn/datasets/princeton-nlp/SWE-bench_bm25_13k_cl100k
Resource description:
### Dataset Summary
SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. The dataset collects 2,294 Issue-Pull Request pairs from 12 popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.
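The unit-test criterion can be read as follows: a candidate patch counts as resolving an instance when every test listed in the `FAIL_TO_PASS` and `PASS_TO_PASS` fields (described under Data Instances) passes once the patch is applied. The sketch below is one plausible reading of that rule, not the official harness. The function name `is_resolved`, the `passed_tests` argument, and the test names are all hypothetical.

```python
import json


def is_resolved(fail_to_pass, pass_to_pass, passed_tests):
    """Return True if every required test is among the tests that passed.

    fail_to_pass / pass_to_pass: JSON-encoded lists of test names, in the
    form stored in the FAIL_TO_PASS and PASS_TO_PASS fields.
    passed_tests: names of tests that passed after applying a candidate
    patch (assumed to come from some test runner; hypothetical here).
    """
    required = set(json.loads(fail_to_pass)) | set(json.loads(pass_to_pass))
    return required <= set(passed_tests)


# Made-up test names, for illustration only.
f2p = '["tests/test_a.py::test_bug"]'
p2p = '["tests/test_a.py::test_ok", "tests/test_b.py::test_other"]'

all_pass = {
    "tests/test_a.py::test_bug",
    "tests/test_a.py::test_ok",
    "tests/test_b.py::test_other",
}
bug_still_fails = all_pass - {"tests/test_a.py::test_bug"}

resolved = is_resolved(f2p, p2p, all_pass)
unresolved = is_resolved(f2p, p2p, bug_still_fails)
```

Treating both lists as one required set keeps the check strict: a patch that fixes the issue but breaks a previously passing test is not counted as a resolution.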
### Supported Tasks and Leaderboards
SWE-bench proposes a new task: issue resolution, given a full repository and a GitHub issue. The leaderboard can be found at www.swebench.com.
### Languages
The text of the dataset is primarily English, but we make no effort to filter or otherwise clean based on language type.
## Dataset Structure
### Data Instances
An example of a SWE-bench datum is as follows:
```
instance_id: (str) - A formatted instance identifier, usually as repo_owner__repo_name-PR-number.
patch: (str) - The gold patch, the patch generated by the PR (minus test-related code), that resolved the issue.
repo: (str) - The repository owner/name identifier from GitHub.
base_commit: (str) - The commit hash of the repository representing the HEAD of the repository before the solution PR is applied.
hints_text: (str) - Comments made on the issue before the creation date of the solution PR’s first commit.
created_at: (str) - The creation date of the pull request.
test_patch: (str) - A test-file patch that was contributed by the solution PR.
problem_statement: (str) - The issue title and body.
version: (str) - Installation version to use for running evaluation.
environment_setup_commit: (str) - commit hash to use for environment setup and installation.
FAIL_TO_PASS: (str) - A json list of strings that represent the set of tests resolved by the PR and tied to the issue resolution.
PASS_TO_PASS: (str) - A json list of strings that represent tests that should pass before and after the PR application.
text: (str) - The generated text according to the retrieval criterion and the style-2 prompt found in [github:SWE-bench](https://github.com/princeton-nlp/SWE-bench).
input_ids: (List[int]) - The cl100k_base tokens for each text.
```
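Because `FAIL_TO_PASS` and `PASS_TO_PASS` are stored as JSON strings rather than lists, and `instance_id` packs several values into one string, a record usually needs light decoding before use. The following is a minimal sketch under those assumptions. The record contents and the helper `split_instance_id` are hypothetical, and the helper only handles the usual `repo_owner__repo_name-PR-number` form.

```python
import json

# Hypothetical record holding only the fields used below; all values are made up.
record = {
    "instance_id": "example_owner__example_repo-123",
    "repo": "example_owner/example_repo",
    "FAIL_TO_PASS": '["tests/test_x.py::test_fix"]',
    "PASS_TO_PASS": '["tests/test_x.py::test_a", "tests/test_x.py::test_b"]',
    "input_ids": [1, 2, 3, 4],
}


def split_instance_id(instance_id):
    """Split 'repo_owner__repo_name-PR-number' into (owner, name, pr_number).

    Assumes the usual format; identifiers that deviate from it will raise.
    """
    # The PR number follows the last hyphen; repository names may contain hyphens.
    repo_part, pr_number = instance_id.rsplit("-", 1)
    # Owner and repository name are separated by a double underscore.
    owner, name = repo_part.split("__", 1)
    return owner, name, int(pr_number)


owner, name, pr_number = split_instance_id(record["instance_id"])

# Decode the JSON-encoded test lists into Python lists of test names.
fail_to_pass = json.loads(record["FAIL_TO_PASS"])
pass_to_pass = json.loads(record["PASS_TO_PASS"])

# input_ids is already a list of integers, so its length is the token count.
num_tokens = len(record["input_ids"])
```

Splitting on the last hyphen rather than the first is a deliberate choice, since a repository name can itself contain hyphens.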
Provider:
maas
Created:
2025-08-16
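Collected summary
Dataset introduction

Background and challenges
Background overview
SWE-bench_bm25_13k_cl100k is a dataset for evaluating a system's ability to resolve GitHub issues automatically. It contains 2,294 Issue-Pull Request pairs, drawn mainly from Python projects, and verifies solutions with unit tests. The text is primarily English, and the dataset proposes a new task: resolving an issue given the full repository and the GitHub issue.
The summary above was collected and generated by 遇见数据集.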



