Luoberta/cve_train_v1.1
Source: Hugging Face. Updated 2026-03-27; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Luoberta/cve_train_v1.1
Resource description:
---
license: mit
task_categories:
- text-generation
language:
- en
tags:
- security
- cve
- vulnerability
- agent-traces
- sft
- code
size_categories:
- 10K<n<100K
datasets:
- Luoberta/cve_train
---
# CVE-Factory Agent Traces v1.1
This dataset is an expanded version of [cve_train](https://huggingface.co/datasets/Luoberta/cve_train), containing **18,783 distilled agent traces** for CVE reproduction tasks. The traces were generated using **Claude Opus 4.5** with a **Mini SWE-Agent** harness through the [CVE-Factory](https://github.com/livecvebench/CVE-Factory) pipeline.
## What's New in v1.1
Compared to [cve_train (v1.0)](https://huggingface.co/datasets/Luoberta/cve_train):
- **18.8k total samples** (up from ~4k in v1.0)
- **+3k agentic tasks** from [cve_tasks_3k_compressed](https://huggingface.co/datasets/Luoberta/cve_tasks_3k_compressed)
- Additional traces from expanded CVE task coverage
## Training Results
Fine-tuning on this dataset improves scores on all three benchmarks below, relative to both the Qwen3-32B base model and the v1.0 model:
| Model | LiveCVEBench | PatchEval | Terminal-Bench-2.0 | Avg |
| :--- | :---: | :---: | :---: | :---: |
| Qwen3-32B (base) | 8.96 | 5.64 | 5.41 | 6.67 |
| Abacus-cve (v1.0, 4k data) | 36.50 | 21.94 | 20.14 | 26.19 |
| **[Abacus-cve-v1.1](https://huggingface.co/Luoberta/Abacus-cve-v1.1) (Ours, 18.8k data)** | **40.33** | **24.32** | **21.57** | **28.74** |
| | | | | |
| Qwen3-Coder-30B | 11.29 | 9.25 | 11.01 | 10.51 |
| Qwen3-Coder-480B | 29.14 | 18.06 | 25.17 | 24.12 |
| Claude Sonnet 4 | 34.79 | 24.76 | 26.52 | 28.69 |
**Key findings:**
- **v1.1 vs v1.0**: +3.83 on LiveCVEBench, +2.38 on PatchEval, +1.43 on Terminal-Bench-2.0
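- **Scaling potential**: Scores improve on all three benchmarks when going from ~4k to 18.8k traces, which suggests that additional data continues to help
- **Competitive performance**: [Abacus-cve-v1.1](https://huggingface.co/Luoberta/Abacus-cve-v1.1) (32B) matches Claude Sonnet 4 on the three-benchmark average (28.74 vs 28.69): it leads on LiveCVEBench (40.33 vs 34.79) but trails on PatchEval and Terminal-Bench-2.0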
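
The averages and v1.1 vs v1.0 deltas quoted above can be re-derived from the table. The following is a minimal sketch that copies three rows of the table by hand; the names `SCORES`, `averages`, and `deltas` are illustrative and not part of any released tooling.

```python
# Scores copied from the results table, in the order:
# LiveCVEBench, PatchEval, Terminal-Bench-2.0.
BENCHMARKS = ("LiveCVEBench", "PatchEval", "Terminal-Bench-2.0")
SCORES = {
    "Qwen3-32B (base)": (8.96, 5.64, 5.41),
    "Abacus-cve (v1.0)": (36.50, 21.94, 20.14),
    "Abacus-cve-v1.1": (40.33, 24.32, 21.57),
}

# Unweighted mean over the three benchmarks, rounded to two decimals.
averages = {
    model: round(sum(scores) / len(scores), 2)
    for model, scores in SCORES.items()
}

# Per-benchmark gain of v1.1 over v1.0.
deltas = {
    bench: round(new - old, 2)
    for bench, old, new in zip(
        BENCHMARKS, SCORES["Abacus-cve (v1.0)"], SCORES["Abacus-cve-v1.1"]
    )
}

print(averages)
print(deltas)
```

The sketch assumes the Avg column is a plain unweighted mean of the three benchmark scores, which is consistent with the three rows used here.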
## Dataset Format
Each line is a JSON object with:
```json
{
  "task_id": "cve-2017-15197.2-of-5.2026-01-25__22-10-14",
  "is_resolved": true,
  "messages": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    ...
  ]
}
```
- `task_id`: Unique task identifier (CVE ID + trace index + timestamp)
- `is_resolved`: Whether the task was successfully completed
- `messages`: Conversation history in standard chat format (system/user/assistant turns)
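
To group traces by CVE or by run, it can help to split `task_id` into its parts. The sketch below is inferred from the single example ID shown above and is not an official schema: it assumes the pattern `cve-<year>-<number>.<i>-of-<n>.<YYYY-MM-DD>__<HH-MM-SS>`, and it reads `2-of-5` as "trace 2 of 5", which is an interpretation. `TaskId` and `parse_task_id` are hypothetical helper names.

```python
import re
from datetime import datetime
from typing import NamedTuple


class TaskId(NamedTuple):
    cve_id: str          # e.g. "CVE-2017-15197"
    trace_index: int     # the "2" in "2-of-5" (assumed meaning)
    trace_total: int     # the "5" in "2-of-5" (assumed meaning)
    timestamp: datetime  # parsed from "2026-01-25__22-10-14"


# Pattern inferred from one example ID; other IDs may not follow it.
_TASK_ID_RE = re.compile(
    r"^(?P<cve>cve-\d{4}-\d+)"
    r"\.(?P<idx>\d+)-of-(?P<total>\d+)"
    r"\.(?P<ts>\d{4}-\d{2}-\d{2}__\d{2}-\d{2}-\d{2})$"
)


def parse_task_id(task_id: str) -> TaskId:
    """Split a task_id into CVE ID, trace index/total, and timestamp."""
    match = _TASK_ID_RE.match(task_id)
    if match is None:
        raise ValueError(f"unrecognized task_id format: {task_id!r}")
    return TaskId(
        cve_id=match["cve"].upper(),
        trace_index=int(match["idx"]),
        trace_total=int(match["total"]),
        timestamp=datetime.strptime(match["ts"], "%Y-%m-%d__%H-%M-%S"),
    )


parsed = parse_task_id("cve-2017-15197.2-of-5.2026-01-25__22-10-14")
print(parsed)
```

Raising `ValueError` on a mismatch, rather than returning partial fields, makes it easy to spot IDs that do not follow the assumed pattern.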
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("Luoberta/cve_train_v1.1")
# Access a sample
sample = dataset["train"][0]
print(f"Task: {sample['task_id']}")
print(f"Resolved: {sample['is_resolved']}")
print(f"Turns: {len(sample['messages'])}")
```
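
A common next step is to keep only resolved traces and inspect how many turns each role contributes. The helpers below are a minimal sketch that relies only on the fields documented in Dataset Format; `resolved_only` and `role_counts` are hypothetical names, and the records in `demo_records` are made up for illustration rather than taken from the dataset.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator

Record = Dict[str, Any]


def resolved_only(records: Iterable[Record]) -> Iterator[Record]:
    """Yield only the records whose is_resolved flag is true."""
    for record in records:
        if record["is_resolved"]:
            yield record


def role_counts(record: Record) -> Counter:
    """Count the messages in one trace by role (system/user/assistant)."""
    return Counter(message["role"] for message in record["messages"])


# Made-up records with the documented fields, for illustration only.
demo_records = [
    {
        "task_id": "demo-1",
        "is_resolved": True,
        "messages": [
            {"role": "system", "content": "..."},
            {"role": "user", "content": "..."},
            {"role": "assistant", "content": "..."},
        ],
    },
    {"task_id": "demo-2", "is_resolved": False, "messages": []},
]

kept = list(resolved_only(demo_records))
print(len(kept), role_counts(kept[0]))
```

Because the helpers accept any iterable of dict-like records, `resolved_only(dataset["train"])` should work on the loaded split as well. For large splits, the `datasets` library's own `Dataset.filter` (for example `dataset["train"].filter(lambda ex: ex["is_resolved"])`) is likely the more efficient choice.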
## Related Resources
- **[Abacus-cve-v1.1](https://huggingface.co/Luoberta/Abacus-cve-v1.1)** - Model fine-tuned on this dataset
- **[cve_train (v1.0)](https://huggingface.co/datasets/Luoberta/cve_train)** - Original dataset version
- **[Leaderboard](https://livecvebench.github.io/)** - Live rankings on LiveCVEBench
- **[LiveCVEBench](https://github.com/livecvebench/LiveCVEBench-Preview)** - Security vulnerability benchmark
- **[CVE-Factory](https://github.com/livecvebench/CVE-Factory)** - The multi-agent system that generated these traces
## Citation
```bibtex
@misc{luo2026cvefactory,
title={CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability},
author={Xianzhen Luo and Jingyuan Zhang and Shiqi Zhou and Rain Huang and Chuan Xiao and Qingfu Zhu and Zhiyuan Ma and Xing Yue and Yang Yue and Wencong Zeng and Wanxiang Che},
year={2026},
eprint={2602.03012},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2602.03012}
}
```
Provider:
Luoberta



