A Chinese Dataset for Evaluating the Safeguards in Large Language Models
Source: arXiv. Updated: 2024-02-19. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2402.12193v1
Resource description:
A Chinese dataset for evaluating the safeguards of large language models (LLMs). It aims to fill the gap in safety evaluation for non-English languages, and it is extended to further scenarios to identify false negative and false positive examples in the refusal of risky prompts.
Created:
2024-02-19



