princeton-nlp/SWE-bench_bm25_13k_cl100k
Source: Hugging Face. Updated 2024-04-15; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/princeton-nlp/SWE-bench_bm25_13k_cl100k
Resource description:
---
dataset_info:
  features:
  - name: base_commit
    dtype: string
  - name: hints_text
    dtype: string
  - name: created_at
    dtype: string
  - name: test_patch
    dtype: string
  - name: repo
    dtype: string
  - name: problem_statement
    dtype: string
  - name: version
    dtype: string
  - name: instance_id
    dtype: string
  - name: FAIL_TO_PASS
    dtype: string
  - name: PASS_TO_PASS
    dtype: string
  - name: environment_setup_commit
    dtype: string
  - name: text
    dtype: string
  - name: input_ids
    sequence: int32
  - name: labels
    sequence: int64
  - name: patch
    dtype: string
  splits:
  - name: test
    num_bytes: 278496488
    num_examples: 2294
  download_size: 114205622
  dataset_size: 278496488
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
---
### Dataset Summary
SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. The dataset collects 2,294 Issue-Pull Request pairs from 12 popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.
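The unit test verification described above can be illustrated with a minimal sketch. It assumes a candidate patch counts as resolving an issue only when every test named in the dataset's `FAIL_TO_PASS` and `PASS_TO_PASS` lists passes after the patch is applied. That criterion, the `is_resolved` helper, and the test names are assumptions made for illustration, not the official evaluation harness.

```python
from typing import Dict, Iterable


def is_resolved(
    results: Dict[str, bool],
    fail_to_pass: Iterable[str],
    pass_to_pass: Iterable[str],
) -> bool:
    """Return True if every required test passed after applying a patch.

    `results` maps a test name to True (passed) or False (failed).
    A test missing from `results` is treated as failed (assumption).
    """
    required = list(fail_to_pass) + list(pass_to_pass)
    return all(results.get(test, False) for test in required)


# Hypothetical test names, for illustration only.
fail_to_pass = ["tests/test_a.py::test_fix"]
pass_to_pass = ["tests/test_a.py::test_old"]

good_run = {"tests/test_a.py::test_fix": True, "tests/test_a.py::test_old": True}
bad_run = {"tests/test_a.py::test_fix": True, "tests/test_a.py::test_old": False}

print(is_resolved(good_run, fail_to_pass, pass_to_pass))
print(is_resolved(bad_run, fail_to_pass, pass_to_pass))
```

Treating a missing test result as a failure is a conservative choice in this sketch: a test that never ran cannot demonstrate post-PR behavior.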
### Supported Tasks and Leaderboards
SWE-bench proposes a new task: issue resolution, given a full repository and a GitHub issue. The leaderboard can be found at www.swebench.com.
### Languages
The text of the dataset is primarily English, but we make no effort to filter or otherwise clean based on language type.
## Dataset Structure
### Data Instances
An example of a SWE-bench datum is as follows:
```
instance_id: (str) - A formatted instance identifier, usually as repo_owner__repo_name-PR-number.
patch: (str) - The gold patch, the patch generated by the PR (minus test-related code), that resolved the issue.
repo: (str) - The repository owner/name identifier from GitHub.
base_commit: (str) - The commit hash of the repository representing the HEAD of the repository before the solution PR is applied.
hints_text: (str) - Comments made on the issue before the creation date of the solution PR’s first commit.
created_at: (str) - The creation date of the pull request.
test_patch: (str) - A test-file patch that was contributed by the solution PR.
problem_statement: (str) - The issue title and body.
version: (str) - Installation version to use for running evaluation.
environment_setup_commit: (str) - Commit hash to use for environment setup and installation.
FAIL_TO_PASS: (str) - A json list of strings that represent the set of tests resolved by the PR and tied to the issue resolution.
PASS_TO_PASS: (str) - A json list of strings that represent tests that should pass before and after the PR application.
text: (str) - The generated text according to the retrieval criterion and the style-2 prompt found in [github:SWE-bench](https://github.com/princeton-nlp/SWE-bench).
input_ids: (List[int]) - The cl100k_base tokens for each text.
```
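Because `FAIL_TO_PASS` and `PASS_TO_PASS` are stored as JSON-encoded strings and `instance_id` usually follows the `repo_owner__repo_name-PR-number` pattern, a record typically needs light decoding before use. The following is a minimal sketch using only the standard library. The helper names and the sample record are hypothetical, and the parser returns `None` for identifiers that do not fit the usual pattern.

```python
import json
from typing import Optional, Tuple


def parse_instance_id(instance_id: str) -> Optional[Tuple[str, str, int]]:
    """Split 'owner__name-123' into (owner, name, 123); None if it does not fit."""
    head, sep, number = instance_id.rpartition("-")
    if not sep or not number.isdigit() or "__" not in head:
        return None
    owner, name = head.split("__", 1)
    return owner, name, int(number)


def decode_test_lists(record: dict) -> Tuple[list, list]:
    """Decode the JSON-encoded FAIL_TO_PASS and PASS_TO_PASS string fields."""
    return json.loads(record["FAIL_TO_PASS"]), json.loads(record["PASS_TO_PASS"])


# Hypothetical record containing only the fields used here.
record = {
    "instance_id": "example-owner__example-repo-123",
    "FAIL_TO_PASS": '["tests/test_a.py::test_fix"]',
    "PASS_TO_PASS": '["tests/test_a.py::test_old", "tests/test_b.py::test_other"]',
}

parsed = parse_instance_id(record["instance_id"])
fail_to_pass, pass_to_pass = decode_test_lists(record)
print(parsed, len(fail_to_pass), len(pass_to_pass))
```

Splitting from the right on the last hyphen keeps owner or repository names that themselves contain hyphens intact.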
Provider:
princeton-nlp
Summary of original information
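Data splits
The test split contains 2,294 examples with a total size of 278496488 bytes.
Dataset size
- Download size: 114205622 bytes
- Dataset size: 278496488 bytes
Configuration
The default configuration contains the test split, whose data files are at the path data/test-*.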
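As a rough illustration of the configuration and size figures above, the sketch below shows one way to load the test split with the Hugging Face `datasets` library and to derive two quick statistics from the published metadata. It assumes `datasets` is installed and network access is available when `load_test_split` is called. The helper names are hypothetical, and the loader is defined here but not invoked.

```python
DATASET_ID = "princeton-nlp/SWE-bench_bm25_13k_cl100k"

# Figures copied from the dataset metadata.
NUM_EXAMPLES = 2294
DATASET_SIZE = 278496488   # dataset_size, in bytes
DOWNLOAD_SIZE = 114205622  # download_size, in bytes


def load_test_split(dataset_id: str = DATASET_ID):
    """Load the `test` split of the default configuration from the Hub."""
    # Lazy import: needs `pip install datasets` and network access when called.
    from datasets import load_dataset

    return load_dataset(dataset_id, split="test")


def size_summary(num_examples: int, dataset_size: int, download_size: int) -> dict:
    """Compute average bytes per example and the download-to-dataset size ratio."""
    return {
        "avg_bytes_per_example": dataset_size / num_examples,
        "download_to_dataset_ratio": download_size / dataset_size,
    }


summary = size_summary(NUM_EXAMPLES, DATASET_SIZE, DOWNLOAD_SIZE)
print(summary)
```

By these figures each example averages roughly 121 KB, and the download is about 41% of the stated dataset size.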
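Collected summary
Dataset introduction

Background and challenges
Background overview
SWE-bench_bm25_13k_cl100k is a software engineering benchmark dataset focused on evaluating how well systems can resolve GitHub issues automatically. It contains 2,294 Issue-Pull Request pairs drawn from 12 popular Python repositories, and the effectiveness of a proposed solution is assessed through unit test verification. The dataset is stored in Parquet format, falls in the 1K-10K size category, and provides only a test split for evaluation.
The above content was collected and summarized by 遇见数据集.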



