Name: OliverSlivka/itemset-extraction-v3
Creator: OliverSlivka
Published: 2026-03-18 16:23:44
License: apache-2.0 (as declared in the dataset card metadata)

Download link:

https://hf-mirror.com/datasets/OliverSlivka/itemset-extraction-v3

Resource description:

---
dataset_info:
- config_name: sft
  features:
  - name: messages
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  splits:
  - name: train
    num_examples: 245
  - name: validation
    num_examples: 27
- config_name: dpo
  features:
  - name: prompt
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  - name: chosen
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  - name: rejected
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  splits:
  - name: train
    num_examples: 546
  - name: validation
    num_examples: 60
- config_name: grpo
  features:
  - name: prompt
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  - name: ground_truth
    dtype: string
  splits:
  - name: train
    num_examples: 245
  - name: validation
    num_examples: 27
configs:
- config_name: sft
  data_files:
  - split: train
    path: sft/train-*
  - split: validation
    path: sft/validation-*
- config_name: dpo
  data_files:
  - split: train
    path: dpo/train-*
  - split: validation
    path: dpo/validation-*
- config_name: grpo
  data_files:
  - split: train
    path: grpo/train-*
  - split: validation
    path: grpo/validation-*
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- frequent-itemset-mining
- fine-tuning
- chain-of-thought
- v3
---

# Itemset Extraction Training Dataset: v3

**Version:** v3.10 (2026-03-18)

**Model target:** Qwen2.5-7B-Instruct

## What's New in v3 (vs v2)

| Aspect | v2 | v3 |
|--------|----|----|
| SFT format | Verbose `Row N` in think block | **Concise column-grouped**, spaced `R1, R10, R2` |
| SFT examples | 348 (314/34 split) | **272** (245/27 split, tokenizer-verified ≤4096) |
| R-ref format | N/A (Row N) | **Spaced** `R1, R10` (clean tokenization) |
| Token filter | chars/4 estimate | **Actual Qwen tokenizer** (0 examples >4096) |
| DPO pairs | 606 (546/60) | 606 (546/60), unchanged |

## Configs

### `sft`: Supervised Fine-Tuning with Chain-of-Thought

- 245 train / 27 val examples
- Format: `{messages: [{role, content}]}` with `<think>` reasoning

### `dpo`: Direct Preference Optimization

- 546 train / 60 val pairs
- Chosen = Apriori ground truth, Rejected = real LLM failures from 4 models

### `grpo`: Group Relative Policy Optimization

- 245 train / 27 val (reuses SFT prompts with ground_truth JSON)

## Usage

```python
from datasets import load_dataset

sft = load_dataset("OliverSlivka/itemset-extraction-v3", "sft")
dpo = load_dataset("OliverSlivka/itemset-extraction-v3", "dpo")
grpo = load_dataset("OliverSlivka/itemset-extraction-v3", "grpo")
```

## Training Notebook

Download from this repo: `notebooks/training_3phase_7b.ipynb`

## Version History

- **v3** (2026-03-18): Fixed R-shorthand tokenization, concise CoT format, tokenizer-verified lengths
- **v2** (2026-03-07): Original verbose format, 348 SFT examples → [itemset-extraction-v2](https://huggingface.co/datasets/OliverSlivka/itemset-extraction-v2)
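
The `dpo` config describes its chosen responses as Apriori ground truth, and the dataset is tagged `frequent-itemset-mining`. As a rough illustration of what Apriori computes, here is a minimal pure-Python sketch. It is not the author's pipeline: the function name `apriori`, the `min_support` parameter (an absolute row count), and the toy rows are all assumptions made for this example, and the dataset's actual table format and support threshold are not stated in the card.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    `transactions` is a list of item collections and `min_support` is an
    absolute row count. Both names are illustrative, not from the dataset.
    """
    rows = [frozenset(t) for t in transactions]

    # Level 1: every distinct single item is a candidate.
    candidates = {frozenset([item]) for r in rows for item in r}
    frequent = {}
    k = 1
    while candidates:
        # Count how many rows contain each candidate itemset.
        counts = {c: sum(1 for r in rows if c <= r) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if every (k-1)-subset is frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Toy data, invented for this example.
rows = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
result = apriori(rows, min_support=2)
for itemset, support in sorted(
    result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print(sorted(itemset), support)
```

With `min_support=2`, the three single items and the three pairs are frequent, while the full triple appears in only one row and is dropped. A reward or evaluation function for the `grpo` config could compare a model's output against such a result, but the exact schema of the `ground_truth` JSON would need to be checked in the data first.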
