Crab-SFT

Name: Crab-SFT
Creator: maas
Published: 2025-08-16 10:56:50
License: No description provided

ModelScope community. Updated: 2025-08-16. Indexed: 2025-07-19.

Download link:

https://modelscope.cn/datasets/THU-KEG/Crab-SFT

Description:

# Crab SFT Dataset: Dataset used for SFT stage training

Large language models (LLMs) struggle to follow instructions with complex constraints in format, length, etc. Following the conventional instruction-tuning practice, previous works conduct post-training on complex instruction-response pairs generated by feeding complex instructions to advanced LLMs. However, even advanced LLMs cannot follow complex instructions well, thus limiting the quality of generated data. In this work, we find that existing datasets inherently contain implicit complex constraints and propose a novel data generation technique, constraint back-translation. Specifically, we take the high-quality instruction-response pairs in existing datasets and only adopt advanced LLMs to add complex constraints already met by the responses to the instructions, which naturally reduces costs and data noise. In the experiments, we adopt Llama3-70B-Instruct to back-translate constraints and create a high-quality complex instruction-response dataset, named CRAB. We present that post-training on CRAB improves multiple backbone LLMs' complex instruction-following ability, evaluated on extensive instruction-following benchmarks. We further find that constraint back-translation also serves as a useful auxiliary training objective in post-training.

- 📖 Paper: [Constraint Back-translation Improves Complex Instruction Following of Large Language Models](https://arxiv.org/abs/2410.24175)
- 🦀 Github: [THU/Crab](https://github.com/THU-KEG/Crab)

### Data Description

- **Developed by:** Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li
- **Language(s) (NLP):** English
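
The constraint back-translation idea described above (start from an existing instruction-response pair, then add constraints that the response already satisfies) can be illustrated with a small sketch. This is not the authors' pipeline: the dataset card says they use Llama3-70B-Instruct to produce the constraints, whereas the hypothetical `infer_constraints` function below is a simple rule-based stand-in, and the `Example` class and the constraint wording are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """Hypothetical container for one instruction-response pair."""
    instruction: str
    response: str


def infer_constraints(response: str) -> List[str]:
    """Rule-based stand-in for the LLM step (an assumption, not the paper's method).

    It only describes properties the response already has, so the
    response satisfies every returned constraint by construction.
    """
    constraints = []

    # Length constraint: round the word count up to the next multiple of 10.
    word_count = len(response.split())
    limit = ((word_count + 9) // 10) * 10
    constraints.append(f"Answer in at most {limit} words.")

    # Format constraint: only added when every non-empty line is a bullet.
    lines = [ln for ln in response.splitlines() if ln.strip()]
    if lines and all(ln.lstrip().startswith("- ") for ln in lines):
        constraints.append("Format the answer as a bulleted list.")

    return constraints


def back_translate(example: Example) -> Example:
    """Append the inferred constraints to the instruction; keep the response unchanged."""
    constraints = infer_constraints(example.response)
    new_instruction = " ".join([example.instruction, *constraints])
    return Example(new_instruction, example.response)


seed = Example(
    instruction="Name three primary colors.",
    response="- Red\n- Yellow\n- Blue",
)
augmented = back_translate(seed)
print(augmented.instruction)
```

The point of the sketch is that the response is never regenerated: only the instruction grows, which is why the approach avoids relying on a model's ability to follow complex instructions when producing the training responses.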

Provider: maas

Created: 2025-07-15
