
Dormamu-Labs/Dormamu_GVD-1_Guardrail_Vulnerability_Dataset_for_LLM_Alignment

Source: Hugging Face. Updated: 2025-10-27. Indexed: 2025-11-30.
Download link:
https://hf-mirror.com/datasets/Dormamu-Labs/Dormamu_GVD-1_Guardrail_Vulnerability_Dataset_for_LLM_Alignment
Description:
Dormamu GVD-1 is a comprehensive collection of test cases for large language model (LLM) alignment. It evaluates a model's ability to resist adversarial prompts, maintain ethical boundaries, and produce safe outputs. The dataset supports Reinforcement Learning from Human Feedback (RLHF) training by providing structured preference pairs, which help developers fine-tune models for better safety, fairness, and reliability in real-world applications.
Provider:
Dormamu-Labs