TuringEnterprises/SWE-Bench-plus-plus

Name: TuringEnterprises/SWE-Bench-plus-plus
Creator: TuringEnterprises
Published: 2025-12-30 02:18:08
License: no description available

Source: Hugging Face. Updated 2025-12-30; indexed 2026-01-03.

Download link:

https://hf-mirror.com/datasets/TuringEnterprises/SWE-Bench-plus-plus

Description:

SWE-bench++ is a reimagined end-to-end evaluation framework that addresses pain points in existing software engineering evaluation and introduces new capabilities. The dataset includes 500 high-quality tasks across diverse programming languages and repository types, with over 80% of tasks in the medium-to-hard difficulty range. On average, a task edits more than 120 lines of code (some exceed 1,000 lines) and more than 7 files. The dataset is constructed through an automated pipeline involving scalable sourcing and filtering, intelligent data curation, agentic Dockerization, and automated quality control. SWE-bench++ sets a new standard for evaluating and training software reasoning capabilities and can be generalized to evaluate more holistic software engineering tasks.
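
The size figures in the description (lines of code edited, files edited) can be made concrete with a small sketch. The description does not say how these numbers are measured, so this assumes a task's change is available as a git-style unified diff and that "lines edited" means added plus removed lines. The names `PatchStats` and `patch_stats` are hypothetical and are not part of the dataset or any library.

```python
from dataclasses import dataclass


@dataclass
class PatchStats:
    files_edited: int
    lines_edited: int


def patch_stats(patch: str) -> PatchStats:
    """Count files touched and lines added or removed in a git-style unified diff.

    Assumption: 'lines edited' = added lines + removed lines.
    """
    files = set()
    lines_edited = 0
    in_hunk = False
    for line in patch.splitlines():
        if line.startswith("diff --git "):
            # A new per-file section starts; its header lines ('---', '+++')
            # come before the first hunk and must not be counted as edits.
            in_hunk = False
            files.add(line.split(" b/", 1)[-1])
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and line[:1] in ("+", "-"):
            lines_edited += 1
    return PatchStats(files_edited=len(files), lines_edited=lines_edited)


# A tiny made-up patch, for illustration only.
example = """\
diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,3 +1,3 @@
 import os
-x = 1
+x = 2
 print(x)
diff --git a/util.py b/util.py
--- a/util.py
+++ b/util.py
@@ -0,0 +1,1 @@
+def helper(): pass
"""

stats = patch_stats(example)
print(stats.files_edited, stats.lines_edited)
```

Averaging these two counts over all tasks would give figures comparable in spirit to those quoted above, though the dataset's own counting rules may differ.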

Provider:

TuringEnterprises
