Galtea-AI/galtea-red-teaming-clustered-data
Source: Hugging Face. Updated 2025-05-19. Indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/Galtea-AI/galtea-red-teaming-clustered-data
Description:
The Galtea Red Teaming: Non-Commercial Subset dataset is a collection of adversarial prompts for red teaming and safety evaluation of large language models (LLMs). All prompts are sourced from datasets released under non-commercial licenses and have been deduplicated, normalized to a common format, and automatically clustered by semantic meaning. Each entry contains an adversarial instruction, the name of its origin dataset, and a numeric cluster ID that reflects the prompt's behavior. Because prompts are grouped by behavior, LLM developers can benchmark and test robustness separately for each adversarial style.
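The entry layout described above (instruction, origin dataset, cluster ID) lends itself to per-cluster evaluation. The following is a minimal sketch of that idea, not the dataset's official tooling. The field names `prompt`, `source`, and `cluster`, the sample rows, and the `is_safe` callback are all assumptions made for illustration; the real column names should be checked on the dataset page.

```python
from collections import defaultdict

# Hypothetical rows: the field names "prompt", "source", and "cluster"
# are assumptions, not the dataset's confirmed column names.
SAMPLE_ROWS = [
    {"prompt": "example adversarial instruction A", "source": "dataset_x", "cluster": 0},
    {"prompt": "example adversarial instruction B", "source": "dataset_y", "cluster": 1},
    {"prompt": "example adversarial instruction C", "source": "dataset_x", "cluster": 0},
]


def group_by_cluster(rows, cluster_key="cluster"):
    """Group rows into a dict mapping cluster ID -> list of rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[cluster_key]].append(row)
    return dict(groups)


def pass_rate_per_cluster(rows, is_safe, cluster_key="cluster", prompt_key="prompt"):
    """Return, per cluster, the fraction of prompts for which is_safe(prompt) is True.

    is_safe is a caller-supplied placeholder; in practice it would send the
    prompt to the model under test and judge whether the response is safe.
    """
    rates = {}
    for cluster_id, members in group_by_cluster(rows, cluster_key).items():
        safe_count = sum(1 for row in members if is_safe(row[prompt_key]))
        rates[cluster_id] = safe_count / len(members)
    return rates
```

Reporting a pass rate per cluster rather than one global number makes it easier to see which adversarial style a model handles worst, which is the use the description suggests for the cluster IDs.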
Provider:
Galtea-AI



