Multi-SWE-bench_mini
Source: ModelScope community. Updated 2025-12-26; indexed 2025-04-19.
Download link:
https://modelscope.cn/datasets/ByteDance-Seed/Multi-SWE-bench_mini
Resource description:
## 👋 Overview
To make the benchmark more lightweight and reduce resource consumption, we introduce Multi-SWE-bench mini.
This version contains 400 instances, 50 per language (Python, Java, TypeScript, JavaScript, Go, Rust, C, and C++). It spans three difficulty levels (easy, medium, and hard), giving a balanced and efficient evaluation across languages without excessive resource usage.
The leaderboard can be found at:
https://multi-swe-bench.github.io
## 🧩 Data Instances Structure
Each Multi-SWE-bench instance has the following fields:
```
org: (str) - Organization name identifier from GitHub.
repo: (str) - Repository name identifier from GitHub.
number: (int) - The PR number.
state: (str) - The PR state.
title: (str) - The PR title.
body: (str) - The PR body.
base: (dict) - The target branch information of the PR.
resolved_issues: (list) - A JSON list of strings representing the issues resolved by applying the PR.
fix_patch: (str) - A fix-file patch contributed by the solution PR.
test_patch: (str) - A test-file patch contributed by the solution PR.
fixed_tests: (dict) - A JSON dict of strings representing the tests that should be fixed after the PR is applied.
p2p_tests: (dict) - The tests that should pass both before and after the PR is applied.
f2p_tests: (dict) - The tests resolved by the PR and tied to the issue resolution.
s2p_tests: (dict) - The tests that should be skipped before the PR is applied and pass after it.
n2p_tests: (dict) - The tests that did not exist before the PR is applied and should pass after it.
run_result: (dict) - Overall run results, including the number of tests passed, the number of tests failed, etc.
test_patch_result: (dict) - The result after the test patch is applied.
fix_patch_result: (dict) - The result after all the patches are applied.
instance_id: (str) - A formatted instance identifier, usually of the form org__repo_PR-number.
```
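The four `*2p_tests` fields describe how a test's status changes once the PR is applied. The sketch below illustrates that idea only; it is not the benchmark's own implementation. The status labels (`"pass"`, `"fail"`, `"skip"`, `"none"`), the helper names, and the reading of `f2p` as fail-to-pass are assumptions made for illustration, and the dataset may encode statuses differently.

```python
from typing import Dict, List, Optional

# Hypothetical status labels: the dataset's real encoding may differ.
# Each (before, after) pair maps to the field it would fall under.
TRANSITIONS = {
    ("pass", "pass"): "p2p_tests",  # passes before and after the PR
    ("fail", "pass"): "f2p_tests",  # assumed reading: failed before, passes after
    ("skip", "pass"): "s2p_tests",  # skipped before, passes after
    ("none", "pass"): "n2p_tests",  # did not exist before, passes after
}


def classify_transition(before: str, after: str) -> Optional[str]:
    """Return the field name for a status change, or None if no category applies."""
    return TRANSITIONS.get((before, after))


def bucket_tests(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, List[str]]:
    """Group test names by transition category.

    `before` and `after` map test name -> status. A test that is missing
    from `before` is treated as "none" (it did not exist yet).
    """
    buckets: Dict[str, List[str]] = {name: [] for name in TRANSITIONS.values()}
    for test, after_status in after.items():
        category = classify_transition(before.get(test, "none"), after_status)
        if category is not None:
            buckets[category].append(test)
    return buckets
```

Tests that still fail after the PR fall into none of the four buckets, which is why `classify_transition` can return `None`.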
## ⚙️ Usage
```bash
# Make sure git-lfs is installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/datasets/ByteDance-Seed/Multi-SWE-bench_mini
```
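Once the repository is cloned, the instances can be read with a few lines of Python. This is a minimal sketch that assumes the data is stored as `.jsonl` files (one JSON object per line) somewhere under the clone directory. That layout is an assumption based on the `.jsonl` data links of the full dataset, and the helper names `iter_instances` and `count_by_repo` are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def iter_instances(root: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of every .jsonl file under `root`."""
    for path in sorted(Path(root).rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)


def count_by_repo(root: str) -> Dict[str, int]:
    """Count instances per "org/repo", using the `org` and `repo` fields."""
    counts: Dict[str, int] = {}
    for instance in iter_instances(root):
        key = f"{instance['org']}/{instance['repo']}"
        counts[key] = counts.get(key, 0) + 1
    return counts


# Example call, assuming the default clone directory name:
# counts = count_by_repo("Multi-SWE-bench_mini")
```

If Git LFS was not installed before cloning, the files may contain LFS pointer text rather than JSON, in which case `json.loads` will raise an error.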
## 📚 Citation
```
@misc{zan2025multiswebench,
title={Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving},
author={Daoguang Zan and Zhirong Huang and Wei Liu and Hanwu Chen and Linhao Zhang and Shulin Xin and Lu Chen and Qi Liu and Xiaojian Zhong and Aoyan Li and Siyao Liu and Yongsheng Xiao and Liangqiang Chen and Yuyu Zhang and Jing Su and Tianyu Liu and Rui Long and Kai Shen and Liang Xiang},
year={2025},
eprint={2504.02605},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2504.02605},
}
```
## 📜 License
The dataset is licensed under CC0, subject to any intellectual property rights in the dataset owned by Bytedance. The data is adapted from the listed open source projects; your use of that data must comply with their respective licenses.
| Language | Organization/Repository | Repository Link | Data Link |
| :------- | :------------------------------ | :----------------------------------------------------------- | ------------------------------------------------------------ |
| C | facebook/zstd | [repo_link](https://github.com/facebook/zstd) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/c/facebook__zstd_dataset.jsonl) |
| C | jqlang/jq | [repo_link](https://github.com/jqlang/jq) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/c/jqlang__jq_dataset.jsonl) |
| C | ponylang/ponyc | [repo_link](https://github.com/ponylang/ponyc) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/c/ponylang__ponyc_dataset.jsonl) |
| C++ | catchorg/Catch2 | [repo_link](https://github.com/catchorg/Catch2) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/cpp/catchorg__Catch2_dataset.jsonl) |
| C++ | fmtlib/fmt | [repo_link](https://github.com/fmtlib/fmt) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/cpp/fmtlib__fmt_dataset.jsonl) |
| C++ | nlohmann/json | [repo_link](https://github.com/nlohmann/json) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/cpp/nlohmann__json_dataset.jsonl) |
| C++ | simdjson/simdjson | [repo_link](https://github.com/simdjson/simdjson) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/cpp/simdjson__simdjson_dataset.jsonl) |
| C++ | yhirose/cpp-httplib | [repo_link](https://github.com/yhirose/cpp-httplib) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/cpp/yhirose__cpp-httplib_dataset.jsonl) |
| Go | cli/cli | [repo_link](https://github.com/cli/cli) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/go/cli__cli_dataset.jsonl) |
| Go | grpc/grpc-go | [repo_link](https://github.com/grpc/grpc-go) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/go/grpc__grpc-go_dataset.jsonl) |
| Go | zeromicro/go-zero | [repo_link](https://github.com/zeromicro/go-zero) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/go/zeromicro__go-zero_dataset.jsonl) |
| Java | alibaba/fastjson2 | [repo_link](https://github.com/alibaba/fastjson2) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/java/alibaba__fastjson2_dataset.jsonl) |
| Java | elastic/logstash | [repo_link](https://github.com/elastic/logstash) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/java/elastic__logstash_dataset.jsonl) |
| Java | mockito/mockito | [repo_link](https://github.com/mockito/mockito) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/java/mockito__mockito_dataset.jsonl) |
| JS | anuraghazra/github-readme-stats | [repo_link](https://github.com/anuraghazra/github-readme-stats) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/js/anuraghazra__github-readme-stats_dataset.jsonl) |
| JS | axios/axios | [repo_link](https://github.com/axios/axios) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/js/axios__axios_dataset.jsonl) |
| JS | expressjs/express | [repo_link](https://github.com/expressjs/express) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/js/expressjs__express_dataset.jsonl) |
| JS | iamkun/dayjs | [repo_link](https://github.com/iamkun/dayjs) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/js/iamkun__dayjs_dataset.jsonl) |
| JS | Kong/insomnia | [repo_link](https://github.com/Kong/insomnia) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/js/Kong__insomnia_dataset.jsonl) |
| JS | sveltejs/svelte | [repo_link](https://github.com/sveltejs/svelte) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/js/sveltejs__svelte_dataset.jsonl) |
| Rust | BurntSushi/ripgrep | [repo_link](https://github.com/BurntSushi/ripgrep) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/BurntSushi__ripgrep_dataset.jsonl) |
| Rust | clap-rs/clap | [repo_link](https://github.com/clap-rs/clap) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/clap-rs__clap_dataset.jsonl) |
| Rust | nushell/nushell | [repo_link](https://github.com/nushell/nushell) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/nushell__nushell_dataset.jsonl) |
| Rust | serde-rs/serde | [repo_link](https://github.com/serde-rs/serde) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/serde-rs__serde_dataset.jsonl) |
| Rust | sharkdp/bat | [repo_link](https://github.com/sharkdp/bat) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/sharkdp__bat_dataset.jsonl) |
| Rust | sharkdp/fd | [repo_link](https://github.com/sharkdp/fd) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/sharkdp__fd_dataset.jsonl) |
| Rust | rayon-rs/rayon | [repo_link](https://github.com/rayon-rs/rayon) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/rayon-rs__rayon_dataset.jsonl) |
| Rust | tokio-rs/bytes | [repo_link](https://github.com/tokio-rs/bytes) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/tokio-rs__bytes_dataset.jsonl) |
| Rust | tokio-rs/tokio | [repo_link](https://github.com/tokio-rs/tokio) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/tokio-rs__tokio_dataset.jsonl) |
| Rust | tokio-rs/tracing | [repo_link](https://github.com/tokio-rs/tracing) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/rust/tokio-rs__tracing_dataset.jsonl) |
| TS | darkreader/darkreader | [repo_link](https://github.com/darkreader/darkreader) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/ts/darkreader__darkreader_dataset.jsonl) |
| TS | mui/material-ui | [repo_link](https://github.com/mui/material-ui) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/ts/mui__material-ui_dataset.jsonl) |
| TS | vuejs/core | [repo_link](https://github.com/vuejs/core) | [data_link](https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main/ts/vuejs__core_dataset.jsonl) |
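
The data links in the table follow a regular pattern: a per-language directory, then `org__repo_dataset.jsonl`. The sketch below rebuilds a link from those parts. The pattern is inferred from the rows above rather than documented, and the names `LANGUAGE_DIRS` and `data_link` are hypothetical.

```python
# Base of the data links shown in the table above.
BASE_URL = "https://huggingface.co/datasets/bytedance-research/Multi-SWE-Bench/blob/main"

# Directory used for each language label, as it appears in the table's links.
LANGUAGE_DIRS = {
    "C": "c",
    "C++": "cpp",
    "Go": "go",
    "Java": "java",
    "JS": "js",
    "Rust": "rust",
    "TS": "ts",
}


def data_link(language: str, org: str, repo: str) -> str:
    """Build the data link for one table row from its language, org, and repo."""
    return f"{BASE_URL}/{LANGUAGE_DIRS[language]}/{org}__{repo}_dataset.jsonl"
```

For example, `data_link("C", "facebook", "zstd")` reproduces the link in the first row of the table.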
Provider:
maas
Created:
2025-04-18



