princeton-nlp/SWE-bench_bm25_13k_cl100k

Name: princeton-nlp/SWE-bench_bm25_13k_cl100k
Creator: princeton-nlp
Published: 2024-04-15 22:16:51
License: No description available

Hugging Face, updated 2024-04-15, indexed 2024-03-04

Download link:

https://hf-mirror.com/datasets/princeton-nlp/SWE-bench_bm25_13k_cl100k

Resource description:

---
dataset_info:
  features:
  - name: base_commit
    dtype: string
  - name: hints_text
    dtype: string
  - name: created_at
    dtype: string
  - name: test_patch
    dtype: string
  - name: repo
    dtype: string
  - name: problem_statement
    dtype: string
  - name: version
    dtype: string
  - name: instance_id
    dtype: string
  - name: FAIL_TO_PASS
    dtype: string
  - name: PASS_TO_PASS
    dtype: string
  - name: environment_setup_commit
    dtype: string
  - name: text
    dtype: string
  - name: input_ids
    sequence: int32
  - name: labels
    sequence: int64
  - name: patch
    dtype: string
  splits:
  - name: test
    num_bytes: 278496488
    num_examples: 2294
  download_size: 114205622
  dataset_size: 278496488
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
---

### Dataset Summary

SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. The dataset collects 2,294 Issue-Pull Request pairs from 12 popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.

### Supported Tasks and Leaderboards

SWE-bench proposes a new task: issue resolution given a full repository and a GitHub issue. The leaderboard can be found at www.swebench.com

### Languages

The text of the dataset is primarily English, but we make no effort to filter or otherwise clean based on language type.

## Dataset Structure

### Data Instances

An example of a SWE-bench datum is as follows:

```
instance_id: (str) - A formatted instance identifier, usually as repo_owner__repo_name-PR-number.
patch: (str) - The gold patch, the patch generated by the PR (minus test-related code), that resolved the issue.
repo: (str) - The repository owner/name identifier from GitHub.
base_commit: (str) - The commit hash of the repository representing the HEAD of the repository before the solution PR is applied.
hints_text: (str) - Comments made on the issue prior to the creation date of the solution PR’s first commit.
created_at: (str) - The creation date of the pull request.
test_patch: (str) - A test-file patch that was contributed by the solution PR.
problem_statement: (str) - The issue title and body.
version: (str) - Installation version to use for running evaluation.
environment_setup_commit: (str) - Commit hash to use for environment setup and installation.
FAIL_TO_PASS: (str) - A JSON list of strings that represent the set of tests resolved by the PR and tied to the issue resolution.
PASS_TO_PASS: (str) - A JSON list of strings that represent tests that should pass before and after the PR application.
text: (str) - The generated text according to the retrieval criterion and the style-2 prompt found in [github:SWE-bench](https://github.com/princeton-nlp/SWE-bench).
input_ids: (List[int]) - The cl100k_base tokens for each text.
```
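
Because `FAIL_TO_PASS` and `PASS_TO_PASS` are stored as JSON-encoded strings rather than lists, they need decoding before use. The following is a minimal sketch of that step, together with one plausible way to apply them. The record values are invented for illustration, and the resolution rule in `is_resolved` (every test in both lists must pass after a candidate patch is applied) is an assumption inferred from the field descriptions, not the official evaluation harness.

```python
import json

# Hypothetical record: the field names come from the card, the values are invented.
record = {
    "instance_id": "example_owner__example_repo-123",
    "FAIL_TO_PASS": '["tests/test_api.py::test_fix"]',
    "PASS_TO_PASS": '["tests/test_api.py::test_existing", "tests/test_core.py::test_ok"]',
}


def decode_test_lists(rec):
    """Decode the two JSON-encoded test lists into Python lists of test names."""
    return json.loads(rec["FAIL_TO_PASS"]), json.loads(rec["PASS_TO_PASS"])


def is_resolved(rec, passed_tests):
    """Assumed criterion: every test in both lists passes after the candidate patch."""
    fail_to_pass, pass_to_pass = decode_test_lists(rec)
    return set(fail_to_pass + pass_to_pass) <= set(passed_tests)
```

Here `passed_tests` stands for whatever collection of passing test names a test runner reports; how those names are produced is outside the scope of this sketch.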

Provided by:

princeton-nlp

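Summary of original information

Dataset overview

SWE-bench is a dataset that tests systems' ability to resolve GitHub issues automatically. It collects 2,294 Issue-Pull Request pairs from 12 popular Python repositories. Evaluation is performed by unit test verification, using post-PR behavior as the reference solution.

Supported tasks and leaderboard

SWE-bench proposes a new task: resolving an issue given a full repository and a GitHub issue. The leaderboard can be found at www.swebench.com.

Languages

The text of the dataset is primarily English, but it has not been filtered or cleaned by language.

Dataset structure

Data instances

The fields of a SWE-bench data instance are as follows:

instance_id (str) - A formatted instance identifier, usually repo_owner__repo_name-PR-number.
patch (str) - The gold patch that resolved the issue, i.e. the patch generated by the PR (minus test-related code).
repo (str) - The repository owner/name identifier on GitHub.
base_commit (str) - The commit hash representing the HEAD of the repository before the solution PR is applied.
hints_text (str) - Comments made on the issue before the creation date of the solution PR's first commit.
created_at (str) - The creation date of the pull request.
test_patch (str) - The test-file patch contributed by the solution PR.
problem_statement (str) - The issue title and body.
version (str) - The installation version to use for running evaluation.
environment_setup_commit (str) - The commit hash to use for environment setup and installation.
FAIL_TO_PASS (str) - A JSON list of strings representing the set of tests resolved by the PR and tied to the issue resolution.
PASS_TO_PASS (str) - A JSON list of strings representing tests that should pass both before and after the PR is applied.
text (str) - The text generated according to the retrieval criterion and the style-2 prompt found in github:SWE-bench.
input_ids (List[int]) - The cl100k_base tokens for each text.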
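
The `instance_id` convention described above (repo_owner__repo_name-PR-number) can be split back into its parts. The sketch below is one way to do that; the helper name `parse_instance_id` and the example identifier are hypothetical, and since the card only says identifiers "usually" follow this pattern, anything that does not match raises an error rather than being guessed at.

```python
from typing import NamedTuple


class InstanceId(NamedTuple):
    owner: str
    repo: str
    pr_number: int


def parse_instance_id(instance_id: str) -> InstanceId:
    """Split 'repo_owner__repo_name-PR-number' into its three components."""
    # Peel off the trailing PR number first so hyphens inside the
    # owner or repository name are left untouched.
    prefix, _, number = instance_id.rpartition("-")
    owner, sep, repo = prefix.partition("__")
    if not sep or not owner or not repo or not number.isdigit():
        raise ValueError(f"unexpected instance_id format: {instance_id!r}")
    return InstanceId(owner, repo, int(number))


# Hypothetical identifier used only to illustrate the format.
parsed = parse_instance_id("example-org__example-repo-42")
```

Splitting from the right matters because both the owner and the repository name may themselves contain hyphens.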

Data splits

The test split contains 2,294 examples, totaling 278,496,488 bytes.

Dataset size

Download size: 114,205,622 bytes
Dataset size: 278,496,488 bytes
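
As a quick sanity check on these figures, the short sketch below derives the average size per example and the ratio of download size to dataset size. It uses only the three numbers reported above; the variable names are chosen for illustration.

```python
# Figures as reported in the dataset metadata.
NUM_EXAMPLES = 2294
DATASET_BYTES = 278_496_488
DOWNLOAD_BYTES = 114_205_622

# Average size of one example in the test split.
avg_bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that actually has to be downloaded.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

print(f"average example size: {avg_bytes_per_example / 1024:.1f} KiB")
print(f"download / dataset size: {download_ratio:.2f}")
```

Each example averages roughly 121 kB, which is consistent with every row carrying a full retrieval-augmented prompt in `text` plus its token ids, and the download is about 41% of the dataset size.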

Configuration

The default configuration maps the test split to the data files at data/test-*.
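
Since the dataset exposes a single default configuration with only a test split, loading it can be sketched as follows. This assumes the third-party Hugging Face `datasets` package is installed and that the Hub (or a mirror) is reachable; the wrapper name `load_test_split` is hypothetical.

```python
DATASET_ID = "princeton-nlp/SWE-bench_bm25_13k_cl100k"


def load_test_split():
    """Download (or reuse the local cache of) the single `test` split."""
    # Imported lazily so this module can be imported without the package installed.
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset(DATASET_ID, split="test")


# Usage (requires network access on first call):
#   test_split = load_test_split()
#   test_split.num_rows       # number of examples in the split
#   test_split.column_names   # feature names such as "instance_id" and "text"
```

No configuration name is passed because the repository defines only the default one.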

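Collected summary

Dataset introduction

Background and challenges

Background overview

SWE-bench_bm25_13k_cl100k is a software engineering benchmark dataset focused on evaluating systems' ability to resolve GitHub issues automatically. It contains 2,294 Issue-Pull Request pairs drawn from 12 popular Python repositories, and evaluates the effectiveness of PR solutions through unit test verification. The dataset is stored in Parquet format, is of moderate size (1K-10K), and provides only a test split for evaluation.

The above content was collected and summarized by 遇见数据集.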