OliverSlivka/itemset-extraction-v3
收藏Hugging Face2026-03-18 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/OliverSlivka/itemset-extraction-v3
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
- config_name: sft
features:
- name: messages
list:
- name: content
dtype: string
- name: role
dtype: string
splits:
- name: train
num_examples: 245
- name: validation
num_examples: 27
- config_name: dpo
features:
- name: prompt
list:
- name: content
dtype: string
- name: role
dtype: string
- name: chosen
list:
- name: content
dtype: string
- name: role
dtype: string
- name: rejected
list:
- name: content
dtype: string
- name: role
dtype: string
splits:
- name: train
num_examples: 546
- name: validation
num_examples: 60
- config_name: grpo
features:
- name: prompt
list:
- name: content
dtype: string
- name: role
dtype: string
- name: ground_truth
dtype: string
splits:
- name: train
num_examples: 245
- name: validation
num_examples: 27
configs:
- config_name: sft
data_files:
- split: train
path: sft/train-*
- split: validation
path: sft/validation-*
- config_name: dpo
data_files:
- split: train
path: dpo/train-*
- split: validation
path: dpo/validation-*
- config_name: grpo
data_files:
- split: train
path: grpo/train-*
- split: validation
path: grpo/validation-*
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- frequent-itemset-mining
- fine-tuning
- chain-of-thought
- v3
---
# Itemset Extraction Training Dataset — v3
**Version:** v3.10 (2026-03-18)
**Model target:** Qwen2.5-7B-Instruct
## What's New in v3 (vs v2)
| Aspect | v2 | v3 |
|--------|----|----|
| SFT format | Verbose `Row N` in think block | **Concise column-grouped**, spaced `R1, R10, R2` |
| SFT examples | 348 (314/34 split) | **272** (245/27 split, tokenizer-verified ≤4096) |
| R-ref format | N/A (Row N) | **Spaced** `R1, R10` (clean tokenization) |
| Token filter | chars/4 estimate | **Actual Qwen tokenizer** (0 examples >4096) |
| DPO pairs | 606 (546/60) | 606 (546/60) — unchanged |
## Configs
### `sft` — Supervised Fine-Tuning with Chain-of-Thought
- 245 train / 27 val examples
- Format: `{messages: [{role, content}]}` with `<think>` reasoning
### `dpo` — Direct Preference Optimization
- 546 train / 60 val pairs
- Chosen = Apriori ground truth, Rejected = real LLM failures from 4 models
### `grpo` — Group Relative Policy Optimization
- 245 train / 27 val (reuses SFT prompts with ground_truth JSON)
## Usage
```python
from datasets import load_dataset
sft = load_dataset("OliverSlivka/itemset-extraction-v3", "sft")
dpo = load_dataset("OliverSlivka/itemset-extraction-v3", "dpo")
grpo = load_dataset("OliverSlivka/itemset-extraction-v3", "grpo")
```
## Training Notebook
Download from this repo: `notebooks/training_3phase_7b.ipynb`
## Version History
- **v3** (2026-03-18): Fixed R-shorthand tokenization, concise CoT format, tokenizer-verified lengths
- **v2** (2026-03-07): Original verbose format, 348 SFT examples → [itemset-extraction-v2](https://huggingface.co/datasets/OliverSlivka/itemset-extraction-v2)
提供机构:
OliverSlivka



