Crab-SFT
ModelScope community | Updated 2025-08-16 | Indexed 2025-07-19
Download link:
https://modelscope.cn/datasets/THU-KEG/Crab-SFT
Description:
# Crab SFT Dataset: Training Data for the Supervised Fine-Tuning (SFT) Stage
<p align="justify">
Large language models (LLMs) struggle to follow instructions with complex constraints on format, length, and similar properties. Following conventional instruction-tuning practice, previous works conduct post-training on complex instruction-response pairs generated by feeding complex instructions to advanced LLMs. However, even advanced LLMs cannot follow complex instructions well, which limits the quality of the generated data. In this work, we find that <b><i>existing datasets inherently contain implicit complex constraints</i></b> and propose a novel data generation technique, <b><i>constraint back-translation</i></b>. Specifically, we take the high-quality instruction-response pairs in existing datasets and use advanced LLMs only to add to the instructions complex constraints that the responses already satisfy, which naturally reduces costs and data noise. In our experiments, we adopt Llama3-70B-Instruct to back-translate constraints and create a high-quality complex instruction-response dataset, named <b>CRAB</b>. We show that post-training on <font face="Verdana">CRAB</font> improves multiple backbone LLMs' complex instruction-following ability, as evaluated on extensive instruction-following benchmarks. We further find that constraint back-translation also serves as a useful auxiliary training objective in post-training.
</p>

- 📖 Paper: [Constraint Back-translation Improves Complex Instruction Following of Large Language Models](https://arxiv.org/abs/2410.24175)
- 🦀 GitHub: [THU-KEG/Crab](https://github.com/THU-KEG/Crab)
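
The idea of constraint back-translation described above can be illustrated with a small sketch. This is not the authors' pipeline: the paper uses Llama3-70B-Instruct to write the constraints, whereas the sketch below substitutes two simple rule-based checks so that it runs without a model. The names `Pair`, `derive_constraints`, and `back_translate`, and the wording of the constraints, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Pair:
    """One instruction-response example (hypothetical schema)."""
    instruction: str
    response: str


def derive_constraints(response: str) -> List[str]:
    """Return constraints that the existing response already satisfies.

    Assumption: rule-based checks stand in for the LLM used in the paper.
    """
    constraints = []

    # Length constraint: round the word count up to a multiple of 50,
    # so the stated limit is always met by the response.
    n_words = len(response.split())
    limit = max(50, ((n_words + 49) // 50) * 50)
    constraints.append(f"Answer in at most {limit} words.")

    # Format constraint: only added if every non-empty line is a bullet.
    lines = [ln for ln in response.splitlines() if ln.strip()]
    if lines and all(re.match(r"\s*[-*] ", ln) for ln in lines):
        constraints.append("Format the answer as a bulleted list.")

    return constraints


def back_translate(pair: Pair) -> Pair:
    """Append the derived constraints to the instruction.

    The response is left untouched; only the instruction becomes more complex.
    """
    constraints = derive_constraints(pair.response)
    suffix = "\n".join(f"- {c}" for c in constraints)
    return Pair(
        instruction=f"{pair.instruction}\nConstraints:\n{suffix}",
        response=pair.response,
    )
```

Calling `back_translate(Pair("List three fruits.", "- apple\n- banana\n- cherry"))` returns a pair whose instruction carries a length and a format constraint while the response is unchanged. Because the constraints are read off a response that already exists, no model has to generate a new response under those constraints, which is the source of the reduced cost and noise the abstract mentions.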
### Data Description
- **Developed by:** Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li
- **Language(s) (NLP):** English
Provider:
maas
Created:
2025-07-15



