DCAgent/SERAlike-swesmith-25k
Source: Hugging Face. Updated 2026-04-25; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/DCAgent/SERAlike-swesmith-25k
Description:
# SERAlike-swesmith-25k
A drop-in replacement for [`DCAgent/swesmith-sandboxes-with_tests-25k`](https://huggingface.co/datasets/DCAgent/swesmith-sandboxes-with_tests-25k) where the per-row `instruction.md` is rewritten from the original specific SWE-Smith bug report into a SERA-style **vague, "patch or abstain"** prompt.
## Why
[SERA (2026)](https://arxiv.org/abs/2601.20789) trains agents on synthetic data generated by the **Soft Verified Generation** (SVG) inversion: the teacher is given a vague bug-type hint, the agent invents a bug fix, and the model's output is taken as ground truth (the "bug" being whatever the model decided to change). SERA's published `Sera-4.6-Lite-T2-v4` corpus is generated from the SWE-Smith codebase set (121 repos, per the paper).
This dataset reuses SWE-Smith's existing real-bug task tarballs (Dockerfile + tests + solution all preserved) but replaces the instruction with a SERA-style vague prompt drawn from one of [SERA's 51 ROLLOUT_ONE_PROMPTS](https://github.com/allenai/SERA/blob/main/sera/constants.py). The agent is then explicitly given an **abstain branch** in the prompt: if it cannot identify a real change, it outputs `<abstain/>` and stops.
The point: at data-generation time we get the vagueness of SERA's prompt distribution while keeping SWE-Smith's real test-based verifier. That is a strict improvement over SVG's line-level recall, since training only retains traces where the existing tests still pass after the agent's patch.
## Schema
| column | type | notes |
|---|---|---|
| `path` | string | row id (`swesmith-XXXXXX`), inherited from upstream |
| `task_binary` | bytes | gzipped tar with the same layout as upstream — `instruction.md` (rewritten), `task.toml`, `environment/Dockerfile`, `tests/*`, `solution/*` |
| `bug_label` | string | which of SERA's 51 bug categories was sampled (e.g. `state_management`, `edge_cases`) |
| `source_dataset` | string | `DCAgent/swesmith-sandboxes-with_tests-25k` |
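
As a minimal sketch of how a row's `task_binary` could be inspected, the snippet below pulls `instruction.md` out of the gzipped tar using only the standard library. The helper names `read_instruction` and `make_demo_tarball` are hypothetical, and matching on the file name (rather than a fixed path) is an assumption made so that an optional top-level directory in the archive does not matter.

```python
import io
import tarfile


def read_instruction(task_binary: bytes) -> str:
    """Return the text of instruction.md from a gzipped task tarball."""
    with tarfile.open(fileobj=io.BytesIO(task_binary), mode="r:gz") as tar:
        for member in tar.getmembers():
            # Match on the base name so a directory prefix is tolerated
            # (assumption: the exact archive layout may include one).
            if member.isfile() and member.name.split("/")[-1] == "instruction.md":
                return tar.extractfile(member).read().decode("utf-8")
    raise FileNotFoundError("instruction.md not found in task_binary")


def make_demo_tarball(instruction: str) -> bytes:
    """Build a tiny in-memory tarball, for demonstration only."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        payload = instruction.encode("utf-8")
        info = tarfile.TarInfo(name="instruction.md")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()
```

With a real row, `read_instruction(row["task_binary"])` would be the expected call, assuming the row is a mapping keyed by the column names in the table above.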
The bug label is sampled deterministically from `sha256(path)` so the rewrite is reproducible.
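One way such deterministic sampling might look is sketched below. The card only states that the label is derived from `sha256(path)`; how the digest is mapped to an index (here, the first 8 bytes modulo the list length) is an assumption, as are the function name `pick_bug_label` and the two-entry placeholder list, so this sketch will not necessarily reproduce the dataset's actual labels.

```python
import hashlib

# Placeholder subset for illustration; the full list of 51 categories
# comes from SERA's constants and is not reproduced here.
EXAMPLE_LABELS = ["state_management", "edge_cases"]


def pick_bug_label(path: str, labels: list[str]) -> str:
    """Map a row id to a label deterministically via its SHA-256 digest."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # Assumed mapping: interpret the first 8 bytes as an integer, then
    # reduce modulo the number of labels.
    index = int.from_bytes(digest[:8], "big") % len(labels)
    return labels[index]
```

Because the result depends only on `path` and the label list, rerunning the rewrite yields the same label for each row without storing a random seed.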
## What changes vs upstream
- `instruction.md` — replaced.
- Everything else — `task.toml`, `environment/Dockerfile`, `tests/test.sh`, `tests/test_state.py`, `tests/config.json`, `solution/solve.sh` — **unchanged**, including the SWE-Smith Docker image reference and the buggy commit checkout.
This means agents see SERA-style vague prompts, but the underlying ground-truth bug and the existing test verifier are SWE-Smith's. Patches that don't fix what's actually broken should still fail at verification time, so post-hoc trace filtering is straightforward.
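The filtering step could be sketched roughly as follows. The `Trace` record and its fields are hypothetical, not part of this dataset or of any published tooling, and dropping abstained traces is an assumption about how the retained set would be built.

```python
from dataclasses import dataclass

ABSTAIN_TAG = "<abstain/>"


@dataclass
class Trace:
    """Hypothetical record of one agent rollout on one task."""
    path: str           # row id, e.g. "swesmith-XXXXXX"
    agent_output: str   # final agent message
    tests_passed: bool  # result of running the task's existing tests


def keep_for_training(trace: Trace) -> bool:
    """Keep a trace only if the agent patched and the tests passed."""
    if ABSTAIN_TAG in trace.agent_output:
        # Assumption: abstentions produce no patch and are filtered out.
        return False
    return trace.tests_passed


def filter_traces(traces: list[Trace]) -> list[Trace]:
    return [trace for trace in traces if keep_for_training(trace)]
```

Keeping the verifier result as a plain boolean on the record means the filter itself never needs to rerun the Docker-based tests.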
## Generation
```bash
python -m data.seralike.rewrite # builds shards into ./out/
python -m data.seralike.upload --repo DCAgent/SERAlike-swesmith-25k
```
Source: `OpenThoughts-Agent/data/seralike/`.
Provider:
DCAgent



