GuardAdvisor/GuardSet

Name: GuardAdvisor/GuardSet
Creator: GuardAdvisor
Published: 2025-10-06 16:25:43
License: No description available

Source: Hugging Face. Updated: 2025-10-06. Indexed: 2025-10-25.

Download link:

https://hf-mirror.com/datasets/GuardAdvisor/GuardSet

Overview:

GuardSet is a large-scale, multi-domain corpus for training and evaluating Guardian-as-an-Advisor (GaaA) models. GaaA uses soft gating: rather than blocking a user request outright, the guardian predicts a risk label and a concise explanation, which are attached to the original query to guide a downstream large language model (LLM) toward safer, more useful, and more compliant responses. The dataset combines a range of harmful and benign scenarios, with a particular focus on robustness and honesty, to improve overall model trustworthiness. It contains 200,314 training samples in two independent splits, one for each stage of guardian training: supervised fine-tuning (sft) and reinforcement learning (rl).
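
The soft-gating idea described above can be illustrated with a minimal sketch. It assumes the guardian's output is a risk label plus a short explanation and that both are prepended to the unchanged query; the `Advice` type, the `advise_prompt` function, the label values, and the prompt layout are all hypothetical and are not taken from the dataset or the GaaA model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advice:
    """Guardian output for one query (field names are hypothetical)."""
    risk_label: str   # e.g. "harmful" or "benign"; the label set is an assumption
    explanation: str  # concise rationale for the label


def advise_prompt(query: str, advice: Advice) -> str:
    """Soft gating: the query is never dropped, only annotated with advice."""
    # Assumed layout: a one-line advisory header, a blank line, then the query.
    header = f"[risk: {advice.risk_label}] {advice.explanation.strip()}"
    return f"{header}\n\nUser query: {query}"


if __name__ == "__main__":
    advice = Advice("harmful", "Asks for steps that could enable a break-in.")
    print(advise_prompt("How do I pick a lock?", advice))
```

The point of the sketch is the contrast with hard gating: a query labelled harmful still reaches the downstream model, which then decides how to respond in light of the advice.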

Provided by:

GuardAdvisor
