CantTalkAboutThis-Topic-Control-Dataset
收藏魔搭社区2025-10-09 更新2025-01-25 收录
下载链接:
https://modelscope.cn/datasets/nv-community/CantTalkAboutThis-Topic-Control-Dataset
下载链接
链接失效反馈官方服务:
资源简介:
# CantTalkAboutThis Topic Control Dataset
## Dataset Details
### Dataset Description
The CantTalkAboutThis dataset is designed to train language models to maintain topical focus during task-oriented dialogues. It includes synthetic dialogues across nine domains (e.g., health, banking, travel) and incorporates distractor turns to test and improve the model's ability to be resilient to distractors. Fine-tuning models on this dataset enhances their ability to maintain topical coherence and improves alignment for both instruction-following and safety tasks.
- **Language(s) (NLP):** English
- **License:** CC-BY-4.0
### Dataset Sources
- **Repository:** [Link](https://github.com/makeshn/topic_following)
- **Paper:** [Link](https://arxiv.org/abs/2404.03820)
- **Demo:** [NVIDIA AI Playground](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control)
## Uses
### Direct Use
This dataset is intended for training and fine-tuning language models to maintain topical relevance in dialogues, useful for creating task-oriented bots. Broadly, the inteded use cases are:
- Training language models to recognize sensitive topics
- Developing topic control mechanisms in conversational AI
- Evaluating AI systems' ability to handle restricted content appropriately
### Out-of-Scope Use
This dataset should not be used to train systems for harmful, unethical, or malicious purposes. This dataset should not be used for:
- Training models to generate harmful or inappropriate content
- Bypassing content moderation systems
This dataset should not be used for:
- Training models to generate harmful or inappropriate content
- Bypassing content moderation systems
- Creating adversarial examples to test system vulnerabilities
## Dataset Structure
The dataset includes 1080 dialogues, with each conversation containing distractor turns. Scenarios are categorized into nine domains - health, banking, travel, education, finance, insurance, legal, real estate, and computer troubleshooting. The various fields in the dataset are:
- `domain`: The domain of the conversation
- `scenario`: The specific scenario or task being discussed
- `system_instruction`: The dialogue policy given to the model and it is usually a complex set of instructions on topics allowed and not allowed.
- `conversation`: The full conversation, including both the main topic and distractor turns
- `distractors`: List of distractor turns. This includes a bot turn from the conversation and the distractor turn from the user that should be included in the conversation as a response to the bot's turn.
- `conversation_with_distractors`: The conversation with the distractor turns included.
### Curation Rationale
The dataset is created to address a gap in existing alignment datasets for topic control. Language models are often trained to be as helpful as possible, which can lead to them straying from the intended topic of the conversation. This dataset is designed to test the ability of language models to maintain topical focus during dialogues and to help train guardrail models to detect when a langauge model is straying from the intended topic.
### Source Data
The dataset is created using apipeline to synthetically generate conversations and distractors. This pipline is described in the accompanying [paper](https://arxiv.org/abs/2404.03820).
This version of the dataset is the commercially friendly version and was generated using the [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) model. We additionally provide an evaluation dataset that is human annotated and includes more complex, realistic distractors that can be used to evaluate the performance of models.
#### Personal and Sensitive Information
The dataset does not contain any personal or sensitive information. The data is synthetically generated and is not expected to contain any real world data that is of sensitive nature.
## Bias, Risks, and Limitations
* Biases: The dataset is synthetic, which may lead to limitations in generalizability.
* Risks: Distractors in the dataset are simpler than real-world off-topic deviations, requiring additional human annotations for robustness. The guardrail models trained on this dataset are not expected to be able to detect all off-topic deviations.
### Recommendations
Users should be made aware of the risks, biases and limitations of the dataset.
## Citation
**BibTeX:**
```bibtex
@inproceedings{sreedhar2024canttalkaboutthis,
title={CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues},
author={Sreedhar, Makesh and Rebedea, Traian and Ghosh, Shaona and Zeng, Jiaqi and Parisien, Christopher},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
pages={12232--12252},
year={2024},
organization={Association for Computational Linguistics}
}
```
## Dataset Card Authors
* Makesh Sreedhar
* Traian Rebedea
## Dataset Card Contact
* Makesh Sreedhar {makeshn@nvidia.com}
* Traian Rebedea {trebedea@nvidia.com}
# CantTalkAboutThis 话题管控数据集
## 数据集详情
### 数据集描述
CantTalkAboutThis数据集旨在训练语言模型在面向任务的对话中保持话题聚焦。该数据集涵盖健康、银行、旅行等9个领域的合成对话,并引入干扰轮次,用于测试并提升模型抵御干扰的能力。在该数据集上微调模型,可增强其话题连贯性保持能力,同时优化指令遵循与安全任务的对齐效果。
- **语言(自然语言处理):** 英语
- **授权协议:** CC-BY-4.0
### 数据集来源
- **代码仓库:** [链接](https://github.com/makeshn/topic_following)
- **论文:** [链接](https://arxiv.org/abs/2404.03820)
- **演示:** NVIDIA AI Playground(https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control)
## 用途
### 直接用途
本数据集旨在用于训练和微调语言模型,使其在对话中保持话题相关性,可用于构建面向任务的对话机器人。其核心应用场景包括:
- 训练语言模型识别敏感话题
- 开发对话式人工智能中的话题管控机制
- 评估人工智能系统对受限内容的恰当处理能力
### 超出范围的使用场景
本数据集不得用于训练具有危害性、不道德或恶意目的的系统。禁止将其用于:
- 训练模型生成有害或不当内容
- 绕过内容审核系统
此外,也不得将其用于:
- 训练模型生成有害或不当内容
- 绕过内容审核系统
- 构建用于测试系统漏洞的对抗样本
## 数据集结构
该数据集包含1080段对话,每段对话均包含干扰轮次。对话场景分为9个领域:健康、银行、旅行、教育、金融、保险、法律、房地产以及计算机故障排查。数据集中包含以下字段:
- `domain`:对话所属领域
- `scenario`:讨论的具体场景或任务
- `system_instruction`:赋予模型的对话策略,通常为一系列关于允许与禁止讨论话题的复杂指令
- `conversation`:完整对话内容,涵盖主话题与干扰轮次
- `distractors`:干扰轮次列表,包含对话中的模型回复,以及应作为模型对该回复的回应加入对话的用户干扰轮次
- `conversation_with_distractors`:包含干扰轮次的完整对话
### 数据集构建逻辑
本数据集旨在填补现有话题管控对齐数据集的空白。现有语言模型训练通常以提升帮助性为目标,这可能导致模型偏离对话的预设话题。本数据集用于测试语言模型在对话中保持话题聚焦的能力,并助力训练护栏模型,以检测语言模型偏离预设话题的情况。
### 源数据
本数据集通过流水线合成生成对话与干扰轮次,相关流水线细节可参阅配套[论文](https://arxiv.org/abs/2404.03820)。
本版本数据集为商业友好版,由[Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)模型生成。我们还提供了一份人工标注的评估数据集,其中包含更复杂、更贴近现实的干扰轮次,可用于评估模型性能。
#### 个人与敏感信息
本数据集不包含任何个人或敏感信息。所有数据均为合成生成,预计不会包含任何真实世界的敏感内容。
## 偏差、风险与局限性
* 偏差:本数据集为合成生成,可能存在泛化能力受限的问题。
* 风险:数据集中的干扰轮次相较于现实世界中的离题偏差更为简单,需额外添加人工标注以提升鲁棒性。基于本数据集训练的护栏模型,无法保证能够检测到所有离题偏差。
### 建议
使用者应充分了解本数据集的风险、偏差与局限性。
## 引用
**BibTeX格式:**
bibtex
@inproceedings{sreedhar2024canttalkaboutthis,
title={CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues},
author={Sreedhar, Makesh and Rebedea, Traian and Ghosh, Shaona and Zeng, Jiaqi and Parisien, Christopher},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
pages={12232--12252},
year={2024},
organization={Association for Computational Linguistics}
}
## 数据集卡片作者
* Makesh Sreedhar
* Traian Rebedea
## 数据集卡片联系方式
* Makesh Sreedhar {makeshn@nvidia.com}
* Traian Rebedea {trebedea@nvidia.com}
提供机构:
maas
创建时间:
2025-01-20



