DCAgent/SERAlike-swesmith-25k

Name: DCAgent/SERAlike-swesmith-25k
Creator: DCAgent
Published: 2026-04-25 17:38:50
License: No description available

Source: Hugging Face. Updated 2026-04-25; indexed 2026-05-03.

Download link:

https://hf-mirror.com/datasets/DCAgent/SERAlike-swesmith-25k

Description:

# SERAlike-swesmith-25k

A drop-in replacement for [`DCAgent/swesmith-sandboxes-with_tests-25k`](https://huggingface.co/datasets/DCAgent/swesmith-sandboxes-with_tests-25k) where the per-row `instruction.md` is rewritten from the original specific SWE-Smith bug report into a SERA-style **vague, "patch or abstain"** prompt.

## Why

[SERA (2026)](https://arxiv.org/abs/2601.20789) trains agents on synthetic data generated by the **Soft Verified Generation** (SVG) inversion: the teacher is given a vague bug-type hint, the agent invents a bug-fix, and the model's output is taken as ground truth (the "bug" being whatever the model decided to change). SERA's published `Sera-4.6-Lite-T2-v4` corpus is generated from the SWE-Smith codebase set (121 repos, per the paper).

This dataset reuses SWE-Smith's existing real-bug task tarballs (Dockerfile + tests + solution all preserved) but replaces the instruction with a SERA-style vague prompt drawn from one of [SERA's 51 ROLLOUT_ONE_PROMPTS](https://github.com/allenai/SERA/blob/main/sera/constants.py). The agent is then explicitly given an **abstain branch** in the prompt: if it cannot identify a real change, it outputs `<abstain/>` and stops.

The point: at datagen time we get the vagueness of SERA's prompt distribution while keeping SWE-Smith's real test-based verifier. This is a strict improvement over SVG's line-level recall, since training only retains traces where the existing tests still pass after the agent's patch.

## Schema

| column | type | notes |
|---|---|---|
| `path` | string | row id (`swesmith-XXXXXX`), inherited from upstream |
| `task_binary` | bytes | gzipped tar with the same layout as upstream: `instruction.md` (rewritten), `task.toml`, `environment/Dockerfile`, `tests/*`, `solution/*` |
| `bug_label` | string | which of SERA's 51 bug categories was sampled (e.g. `state_management`, `edge_cases`) |
| `source_dataset` | string | `DCAgent/swesmith-sandboxes-with_tests-25k` |

The bug label is sampled deterministically from `sha256(path)` so the rewrite is reproducible.

## What changes vs upstream

- `instruction.md`: replaced.
- Everything else (`task.toml`, `environment/Dockerfile`, `tests/test.sh`, `tests/test_state.py`, `tests/config.json`, `solution/solve.sh`): **unchanged**, including the SWE-Smith Docker image reference and the buggy commit checkout.

This means agents see SERA-style vague prompts but the underlying ground-truth bug + the existing test verifier are SWE-Smith's. Patches that don't fix what's actually broken should still fail at verification time, so post-hoc trace filtering is straightforward.

## Generation

```bash
python -m data.seralike.rewrite   # builds shards into ./out/
python -m data.seralike.upload --repo DCAgent/SERAlike-swesmith-25k
```

Source: `OpenThoughts-Agent/data/seralike/`.
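
## Example: reading a row

The schema and the `sha256(path)` sampling described above can be illustrated with a minimal sketch. This is not the code in `OpenThoughts-Agent/data/seralike/`. The helper names (`pick_label`, `read_instruction`, `make_demo_task_binary`), the two-element label list, and the "first 8 digest bytes modulo list length" mapping are all assumptions made for illustration; the actual rewrite script may map the hash to a label differently. The sketch only relies on `task_binary` being a gzipped tar that contains `instruction.md`, as stated in the schema.

```python
import hashlib
import io
import posixpath
import tarfile
from typing import Sequence

# Hypothetical stand-in for SERA's 51 categories; only these two labels
# are named in this README.
EXAMPLE_LABELS = ["state_management", "edge_cases"]


def pick_label(path: str, labels: Sequence[str]) -> str:
    """Deterministically map a row id to a label via sha256.

    Assumption: the digest-to-index mapping below is illustrative only.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(labels)
    return labels[index]


def read_instruction(task_binary: bytes) -> str:
    """Return the text of instruction.md from a gzipped tar held in memory."""
    with tarfile.open(fileobj=io.BytesIO(task_binary), mode="r:gz") as tar:
        for member in tar.getmembers():
            # Match on the file name so a leading "./" or top-level
            # directory inside the archive does not matter.
            if member.isfile() and posixpath.basename(member.name) == "instruction.md":
                extracted = tar.extractfile(member)
                if extracted is not None:
                    return extracted.read().decode("utf-8")
    raise FileNotFoundError("instruction.md not found in task_binary")


def make_demo_task_binary(instruction: str) -> bytes:
    """Build a tiny in-memory tarball so the sketch runs without the dataset."""
    payload = instruction.encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="instruction.md")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    row_path = "swesmith-000001"  # hypothetical row id in the documented format
    blob = make_demo_task_binary("Fix a real issue, or output <abstain/>.")
    print(pick_label(row_path, EXAMPLE_LABELS))
    print(read_instruction(blob))
```

With a real row, the `task_binary` bytes would be passed to `read_instruction` in place of the demo tarball. Because the label depends only on the hash of `path`, calling `pick_label` twice on the same row id always returns the same label, which is the property the README relies on for reproducibility.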

Provided by:

DCAgent
