DangerousQA
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/declare-lab/red-instruct
Description:
DangerousQA contains 200 toxic questions spanning several harmful categories, including racial discrimination, stereotypes, gender discrimination, and illegal content. It is designed for safety evaluation: the questions are posed to a language model, and the model's responses to these toxic and harmful prompts are assessed.



