SWE-bench_oracle_cl100k

Name: SWE-bench_oracle_cl100k
Creator: maas
Published: 2025-12-05 16:46:44
License: No description provided

ModelScope Community. Updated: 2025-12-05. Indexed: 2025-10-04.

Download link:

https://modelscope.cn/datasets/princeton-nlp/SWE-bench_oracle_cl100k

Description:

### Dataset Summary

SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. The dataset collects 2,294 Issue-Pull Request pairs from 12 popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.

### Supported Tasks and Leaderboards

SWE-bench proposes a new task: issue resolution, given a full repository and a GitHub issue. The leaderboard can be found at www.swebench.com.

### Languages

The text of the dataset is primarily English, but we make no effort to filter or otherwise clean based on language type.

## Dataset Structure

### Data Instances

An example of a SWE-bench datum is as follows:

```
instance_id: (str) - A formatted instance identifier, usually as repo_owner__repo_name-PR-number.
patch: (str) - The gold patch, the patch generated by the PR (minus test-related code), that resolved the issue.
repo: (str) - The repository owner/name identifier from GitHub.
base_commit: (str) - The commit hash of the repository representing the HEAD of the repository before the solution PR is applied.
hints_text: (str) - Comments made on the issue prior to the creation date of the solution PR’s first commit.
created_at: (str) - The creation date of the pull request.
test_patch: (str) - A test-file patch that was contributed by the solution PR.
problem_statement: (str) - The issue title and body.
version: (str) - Installation version to use for running evaluation.
environment_setup_commit: (str) - The commit hash to use for environment setup and installation.
FAIL_TO_PASS: (str) - A JSON list of strings that represent the set of tests resolved by the PR and tied to the issue resolution.
PASS_TO_PASS: (str) - A JSON list of strings that represent tests that should pass before and after the PR application.
text: (str) - The generated text according to the retrieval criterion and the style-2 prompt found in [github:SWE-bench](https://github.com/princeton-nlp/SWE-bench).
input_ids: (List[int]) - The llama tokens for each text.
```

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
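Because `FAIL_TO_PASS` and `PASS_TO_PASS` are stored as JSON-encoded strings rather than native lists, they need to be decoded before use. The following is a minimal sketch of how a record might be handled, not the official evaluation harness. The helper names (`parse_instance_id`, `decode_tests`, `is_resolved`) and the sample record are hypothetical. The resolution rule used here (every test in both lists must pass) is an assumption based on the field descriptions above, and the identifier parsing only covers the "usual" `repo_owner__repo_name-PR-number` shape.

```python
import json


def parse_instance_id(instance_id: str) -> tuple[str, str, str]:
    """Split an identifier of the usual form owner__name-number.

    Assumption: the owner contains no double underscore and the PR
    number is the part after the final hyphen.
    """
    owner, rest = instance_id.split("__", 1)
    name, pr_number = rest.rsplit("-", 1)
    return owner, name, pr_number


def decode_tests(record: dict, key: str) -> list[str]:
    """Decode a JSON-encoded list of test names stored as a string."""
    return json.loads(record[key])


def is_resolved(record: dict, passed_tests: set[str]) -> bool:
    """Assumed rule: resolved when all FAIL_TO_PASS and PASS_TO_PASS tests pass."""
    required = set(decode_tests(record, "FAIL_TO_PASS"))
    required |= set(decode_tests(record, "PASS_TO_PASS"))
    return required <= passed_tests


# Hypothetical record, invented for illustration; not taken from the dataset.
example_record = {
    "instance_id": "example-owner__example-repo-123",
    "FAIL_TO_PASS": '["tests/test_fix.py::test_bug"]',
    "PASS_TO_PASS": '["tests/test_core.py::test_a", "tests/test_core.py::test_b"]',
}

owner, name, pr_number = parse_instance_id(example_record["instance_id"])
all_passed = {
    "tests/test_fix.py::test_bug",
    "tests/test_core.py::test_a",
    "tests/test_core.py::test_b",
}
print(owner, name, pr_number, is_resolved(example_record, all_passed))
```

Keeping the decoding in one helper means a malformed field raises `json.JSONDecodeError` at a single, predictable place rather than somewhere inside the comparison logic.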


Provider: maas

Created: 2025-08-17
