JinnP/amdpilot-lora-sft-dataset
收藏Hugging Face2026-03-23 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/JinnP/amdpilot-lora-sft-dataset
下载链接
链接失效反馈官方服务:
资源简介:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- amd
- rocm
- triton
- kernel-optimization
- agent-trajectories
- tool-calling
size_categories:
- n<1K
---
# amdpilot LoRA SFT Dataset
Multi-turn agent trajectories for fine-tuning LLMs on AMD GPU kernel engineering tasks. Used to train the [Qwen3.5-397B-A17B LoRA SFT](https://huggingface.co/JinnP/Qwen3.5-397B-A17B-LoRA-SFT-v4) adapter.
## Dataset Overview
| Stat | Value |
|------|-------|
| Raw trajectories | 104 |
| Task types | 94 KernelBench Triton optimization + 4 SGLang/vLLM bugfix + 4 frontier bugfix + 2 executor tasks |
| Format | OpenAI function-calling (role/content/tool_calls/tool_call_id) |
| Token range | 8.6K - 101K per trajectory (median 54K) |
| Tools | ReadFile, WriteFile, Shell, plus analysis |
## Files
### Raw Data
| File | Description | Examples |
|------|-------------|----------|
| `raw_trajectories.jsonl` | All 104 unprocessed agent trajectories | 104 |
### v4 Data (recommended -- fixed role alternation + leak-free eval)
Processed by `convert_amdpilot_sft.py` (v4). Key fixes from v2: removed separator/recap messages that broke LlamaFactory's OpenAI converter validation (which silently dropped 66% of v2/v3 training data), and added trajectory-level eval split with zero task overlap.
| File | Description | Examples |
|------|-------------|----------|
| `v4_train.jsonl` | 3 views (bookend + full + solution) from 92 train trajectories | 270 |
| `v4_eval.jsonl` | Full trajectories from 10 held-out tasks (zero overlap with train) | 10 |
### v2 Data (legacy -- has role alternation bugs)
| File | Description | Examples |
|------|-------------|----------|
| `v2_combined_train.jsonl` | 3 views merged (but bookend/chunk views silently dropped by converter) | 296 |
| `v2_combined_eval.jsonl` | Bookend eval (broken -- fails converter validation) | 10 |
| `v2_full_train.jsonl` | Full trajectories only | 90 |
| `v2_full_eval.jsonl` | Full eval (leaked -- identical to train examples) | 10 |
| `v2_chunks_train.jsonl` | Solution chunks (silently dropped by converter) | 1,043 |
| `v2_chunks_eval.jsonl` | Chunks eval | 115 |
### Other
| File | Description |
|------|-------------|
| `convert_amdpilot_sft.py` | v4 data prep script (3-view extraction with proper role alternation) |
| `stats.json` | Token statistics for the raw dataset |
## Three-View Extraction
**View 1 -- Bookend:** Task prefix (system + user prompt) + final solution suffix. Direct concatenation, no separator. Teaches "given this task, here is the solution."
**View 2 -- Full:** Complete trajectory (truncated at training time by cutoff_len). Teaches problem analysis, debugging approach, iterative refinement.
**View 3 -- Solution Chunks:** Task prefix + last 6-8 turns containing the successful solution. Teaches the final debugging/iteration pattern.
## v2 vs v4 Data Bug
v2/v3 injected fake `user`-role separator messages between prefix and suffix in bookend/chunk views. This violated LlamaFactory's strict role alternation and caused 66% of training examples to be silently dropped. v4 removes these separators and validates all examples pass the converter. See the [Notion page](https://www.notion.so/329651cb22e580ed82b6e78af7ecae6b) for full analysis.
## Training Versions
| Version | Dataset | Effective Examples | Train Loss | Eval Loss | Adapter |
|---------|---------|-------------------|-----------|-----------|---------|
| v1 | v1 naive chunks | ~100 | 0.163 | n/a | [v1](https://huggingface.co/JinnP/Qwen3.5-397B-A17B-LoRA-SFT-v1) |
| v2 | v2_combined (broken) | ~100 (66% dropped) | 0.085 | n/a | [v2](https://huggingface.co/JinnP/Qwen3.5-397B-A17B-LoRA-SFT-v2) |
| v3 | v2_combined (broken) | ~100 (66% dropped) | 0.059 | 0.044 (leaked) | [v3](https://huggingface.co/JinnP/Qwen3.5-397B-A17B-LoRA-SFT-v3) |
| **v4** | **v4 (fixed)** | **270** | **0.199** | **0.055 (clean)** | [**v4**](https://huggingface.co/JinnP/Qwen3.5-397B-A17B-LoRA-SFT-v4) |
提供机构:
JinnP



